NOVEL NUCLEIC ACIDS AND POLYPEPTIDES

1. TECHNICAL FIELD 

The present invention provides novel polynucleotides and proteins encoded by such 
polynucleotides, along with uses for these polynucleotides and proteins, for example in 
therapeutic, diagnostic and research methods. 

2. BACKGROUND 

Technology aimed at the discovery of protein factors (including, e.g., cytokines, such as
lymphokines, interferons, circulating soluble factors ("CSF"), chemokines, and interleukins)
has matured rapidly over the past decade. The now routine hybridization cloning and expression
cloning techniques clone novel polynucleotides "directly" in the sense that they rely on
information directly related to the discovered protein (i.e., partial DNA/amino acid sequence of
the protein in the case of hybridization cloning; activity of the protein in the case of expression
cloning). More recent "indirect" cloning techniques such as signal sequence cloning, which
isolates DNA sequences based on the presence of a now well-recognized secretory leader
sequence motif, as well as various PCR-based or low stringency hybridization-based cloning
techniques, have advanced the state of the art by making available large numbers of DNA/amino
acid sequences for proteins that are known to have biological activity, for example, by virtue of 
their secreted nature in the case of leader sequence cloning, by virtue of their cell or tissue source 
in the case of PCR-based techniques, or by virtue of structural similarity to other genes of known 
biological activity. 

Identified polynucleotide and polypeptide sequences have numerous applications in, for
example, diagnostics, forensics, gene mapping, identification of mutations responsible for
genetic disorders or other traits, assessment of biodiversity, and production of many other types
of data and products dependent on DNA and amino acid sequences.

3. SUMMARY OF THE INVENTION 

The compositions of the present invention include novel isolated polypeptides, novel 
isolated polynucleotides encoding such polypeptides, including recombinant DNA molecules, 
cloned genes or degenerate variants thereof, especially naturally occurring variants such as allelic 
variants, antisense polynucleotide molecules, and antibodies that specifically recognize one or more
epitopes present on such polypeptides, as well as hybridomas producing such antibodies.
The compositions of the present invention additionally include vectors, including expression
vectors, containing the polynucleotides of the invention, cells genetically engineered to contain such
polynucleotides and cells genetically engineered to express such polynucleotides.

The present invention relates to a collection or library of at least one novel nucleic acid
sequence assembled from expressed sequence tags (ESTs) isolated mainly by sequencing by
hybridization (SBH), and in some cases, sequences obtained from one or more public databases.
The invention relates also to the proteins encoded by such polynucleotides, along with therapeutic,
diagnostic and research utilities for these polynucleotides and proteins. This nucleic acid sequence
is designated as SEQ ID NO:1. The polypeptide sequences are designated SEQ ID NO:2 and 3.
The nucleic acids and polypeptides are provided in the Sequence Listing. In the nucleic acids
provided in the Sequence Listing, A is adenine; C is cytosine; G is guanine; T is thymine; and N
is any of the four bases. In the amino acids provided in the Sequence Listing, * corresponds to the
stop codon.

The nucleic acid sequences of the present invention also include nucleic acid sequences that
hybridize to the complement of SEQ ID NO:1 under stringent hybridization conditions; nucleic acid
sequences which are allelic variants or species homologues of any of the nucleic acid sequences
recited above; or nucleic acid sequences that encode a peptide comprising a specific domain or
truncation of the peptide encoded by SEQ ID NO:1. Also included is a polynucleotide comprising a
nucleotide sequence having at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%,
91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to an identifying sequence of SEQ ID NO:1
or a degenerate variant or fragment thereof. The identifying sequence can be 100 base pairs in length.

The nucleic acid sequences of the present invention also include the sequence information
from the nucleic acid sequences of SEQ ID NO:1. The sequence information can be a segment of
any one of SEQ ID NO:1 that uniquely identifies or represents the sequence information of SEQ ID
NO:1.

A collection as used in this application can be a collection of only one polynucleotide. The
collection of sequence information or identifying information of each sequence can be provided on
a nucleic acid array. In one embodiment, segments of sequence information are provided on a
nucleic acid array to detect the polynucleotide that contains the segment. The array can be designed
to detect full-match or mismatch to the polynucleotide that contains the segment. The collection
can also be provided in a computer-readable format.

This invention also includes the reverse or direct complement of any of the nucleic acid
sequences recited above; cloning or expression vectors containing the nucleic acid sequences; and
host cells or organisms transformed with these expression vectors. Nucleic acid sequences (or their
reverse or direct complements) according to the invention have numerous applications in a variety
of techniques known to those skilled in the art of molecular biology, such as use as hybridization
probes, use as primers for PCR, use in an array, use in computer-readable media, use in sequencing
full-length genes, use for chromosome and gene mapping, use in the recombinant production of
protein, and use in the generation of anti-sense DNA or RNA, their chemical analogs and the like.
In a preferred embodiment, the nucleic acid sequences of SEQ ID NO:1 or novel segments
or parts of the nucleic acids of the invention are used as primers in expression assays that are well
known in the art. In a particularly preferred embodiment, the nucleic acid sequences of SEQ ID
NO:1 or novel segments or parts of the nucleic acids provided herein are used in diagnostics for
identifying expressed genes or, as well known in the art and exemplified by Vollrath et al., Science
258:52-59 (1992), as expressed sequence tags for physical mapping of the human genome.

The isolated polynucleotides of the invention include, but are not limited to, a
polynucleotide comprising any one of the nucleotide sequences set forth in SEQ ID NO:1; a
polynucleotide comprising any of the full length protein coding sequences of SEQ ID NO:2; a
polynucleotide comprising any of the nucleotide sequences of the mature protein coding sequences
of SEQ ID NO:2; a polynucleotide comprising any of the full length protein coding sequences of
SEQ ID NO:3; and a polynucleotide comprising any of the nucleotide sequences of the mature
protein coding sequences of SEQ ID NO:3. The polynucleotides of the present invention also
include, but are not limited to, a polynucleotide that hybridizes under stringent hybridization
conditions to (a) the complement of any one of the nucleotide sequences set forth in SEQ ID NO:1;
(b) a nucleotide sequence encoding any one of the amino acid sequences set forth in the Sequence
Listing; (c) a polynucleotide which is an allelic variant of any polynucleotides recited above; (d) a
polynucleotide which encodes a species homolog (e.g., orthologs) of any of the proteins recited
above; or (e) a polynucleotide that encodes a polypeptide comprising a specific domain or
truncation of any of the polypeptides comprising an amino acid sequence set forth in the Sequence
Listing.

The isolated polypeptides of the invention include, but are not limited to, a polypeptide
comprising any of the amino acid sequences set forth in the Sequence Listing; or the corresponding
full length or mature protein. Polypeptides of the invention also include polypeptides with
biological activity that are encoded by (a) any of the polynucleotides having a nucleotide sequence
set forth in SEQ ID NO:1; or (b) polynucleotides that hybridize to the complement of the
polynucleotides of (a) under stringent hybridization conditions. Biologically or immunologically
active variants of any of the polypeptide sequences in the Sequence Listing, and "substantial
equivalents" thereof (e.g., with at least about 65%, 70%, 75%, 80%, 81%, 82%, 83%, 84%, 85%,
86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% amino acid
sequence identity) that preferably retain biological activity are also contemplated. The polypeptides
of the invention may be wholly or partially chemically synthesized but are preferably produced by
recombinant means using the genetically engineered cells (e.g., host cells) of the invention.

The invention also provides compositions comprising a polypeptide of the invention.
Polypeptide compositions of the invention may further comprise an acceptable carrier, such as a
hydrophilic, e.g., pharmaceutically acceptable, carrier.

The invention also provides host cells transformed or transfected with a polynucleotide of 
the invention. 

The invention also relates to methods for producing a polypeptide of the invention
comprising growing a culture of the host cells of the invention in a suitable culture medium
under conditions permitting expression of the desired polypeptide, and purifying the polypeptide
from the culture or from the host cells. Preferred embodiments include those in which the
protein produced by such process is a mature form of the protein.

Polynucleotides according to the invention have numerous applications in a variety of
techniques known to those skilled in the art of molecular biology. These techniques include use
as hybridization probes, use as oligomers, or primers, for PCR, use for chromosome and gene
mapping, use in the recombinant production of protein, and use in generation of anti-sense DNA
or RNA, their chemical analogs and the like. For example, when the expression of an mRNA is
largely restricted to a particular cell or tissue type, polynucleotides of the invention can be used
as hybridization probes to detect the presence of the particular cell or tissue mRNA in a sample
using, e.g., in situ hybridization.

In other exemplary embodiments, the polynucleotides are used in diagnostics as 
expressed sequence tags for identifying expressed genes or, as well known in the art and 
exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for
physical mapping of the human genome. 

The polypeptides according to the invention can be used in a variety of conventional
procedures and methods that are currently applied to other proteins. For example, a polypeptide
of the invention can be used to generate an antibody that specifically binds the polypeptide. Such
antibodies, particularly monoclonal antibodies, are useful for detecting or quantitating the
polypeptide in tissue. The polypeptides of the invention can also be used as molecular weight
markers, and as a food supplement.

Methods are also provided for preventing, treating, or ameliorating a medical condition
which comprises the step of administering to a mammalian subject a therapeutically effective
amount of a composition comprising a polypeptide of the present invention and a
pharmaceutically acceptable carrier.

In particular, the polypeptides and polynucleotides of the invention can be utilized, for
example, in methods for the prevention and/or treatment of disorders involving aberrant protein
expression or biological activity.

The present invention further relates to methods for detecting the presence of the
polynucleotides or polypeptides of the invention in a sample. Such methods can, for example, be
utilized as part of prognostic and diagnostic evaluation of disorders as recited herein and for the
identification of subjects exhibiting a predisposition to such conditions. The invention provides
a method for detecting the polynucleotides of the invention in a sample, comprising contacting
the sample with a compound that binds to and forms a complex with the polynucleotide of
interest for a period sufficient to form the complex and under conditions sufficient to form a
complex and detecting the complex such that if a complex is detected, the polynucleotide of
interest is detected. The invention also provides a method for detecting the polypeptides of the
invention in a sample comprising contacting the sample with a compound that binds to and forms
a complex with the polypeptide under conditions and for a period sufficient to form the complex
and detecting the formation of the complex such that if a complex is formed, the polypeptide is
detected.

The invention also provides kits comprising polynucleotide probes and/or monoclonal 
antibodies, and optionally quantitative standards, for carrying out methods of the invention. 
Furthermore, the invention provides methods for evaluating the efficacy of drugs, and 
monitoring the progress of patients, involved in clinical trials for the treatment of disorders as
recited above. 

The invention also provides methods for the identification of compounds that modulate
(i.e., increase or decrease) the expression or activity of the polynucleotides and/or polypeptides
of the invention. Such methods can be utilized, for example, for the identification of compounds
that can ameliorate symptoms of disorders as recited herein. Such methods can include, but are
not limited to, assays for identifying compounds and other substances that interact with (e.g.,
bind to) the polypeptides of the invention. The invention provides a method for identifying a
compound that binds to the polypeptides of the invention comprising contacting the compound
with a polypeptide of the invention in a cell for a time sufficient to form a polypeptide/compound
complex, wherein the complex drives expression of a reporter gene sequence in the
cell; and detecting the complex by detecting the reporter gene sequence expression such that if
expression of the reporter gene is detected, a compound that binds to a polypeptide of the
invention is identified.

The invention also provides methods for treatment which involve the
administration of the polynucleotides or polypeptides of the invention to individuals exhibiting
symptoms or tendencies. In addition, the invention encompasses methods for treating diseases or
disorders as recited herein comprising administering compounds and other substances that
modulate the overall activity of the target gene products. Compounds and other substances can
effect such modulation either on the level of target gene/protein expression or target protein
activity.

The polypeptides of the present invention and the polynucleotides encoding them are also
useful for the same functions known to one of skill in the art as the polypeptides and
polynucleotides to which they have homology (set forth in Example 4). The polypeptides and
polynucleotides of the present invention are also useful for a variety of applications, as described
herein, including use in arrays for detection.

4. DETAILED DESCRIPTION OF THE INVENTION

4.1 DEFINITIONS 

It must be noted that as used herein and in the appended claims, the singular forms "a",
"an" and "the" include plural references unless the context clearly dictates otherwise.

The term "active" refers to those forms of the polypeptide which retain the biologic
and/or immunologic activities of any naturally occurring polypeptide. According to the
invention, the terms "biologically active" or "biological activity" refer to a protein or peptide
having structural, regulatory or biochemical functions of a naturally occurring molecule.
Likewise "immunologically active" or "immunological activity" refers to the capability of the
natural, recombinant or synthetic polypeptide to induce a specific immune response in
appropriate animals or cells and to bind with specific antibodies.

The term "activated cells" as used in this application refers to those cells which are engaged in
extracellular or intracellular membrane trafficking, including the export of secretory or
enzymatic molecules as part of a normal or disease process.

The terms "complementary" or "complementarity" refer to the natural binding of
polynucleotides by base pairing. For example, the sequence 5'-AGTC-3' binds to the
complementary sequence 3'-TCAG-5'. Complementarity between two single-stranded
molecules may be "partial" such that only some of the nucleic acids bind or it may be
"complete" such that total complementarity exists between the single stranded molecules. The
degree of complementarity between the nucleic acid strands has significant effects on the
efficiency and strength of the hybridization between the nucleic acid strands.
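
As an illustration only, and not part of the original disclosure, the distinction between
"complete" and "partial" complementarity can be sketched in a few lines of Python. The
function names are hypothetical, and the sketch assumes unambiguous DNA bases (A, C, G, T)
and two strands of equal length aligned antiparallel.

```python
# Watson-Crick pairing for DNA. Assumption: only unambiguous bases A, C, G, T.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement_3_to_5(seq_5_to_3: str) -> str:
    """Return the complementary strand, written 3'->5', position by position."""
    return "".join(PAIRS[base] for base in seq_5_to_3.upper())


def fraction_paired(strand_5_to_3: str, other_3_to_5: str) -> float:
    """Fraction of aligned positions that form a Watson-Crick base pair.

    A value of 1.0 corresponds to "complete" complementarity; any lower
    value corresponds to "partial" complementarity.
    """
    if len(strand_5_to_3) != len(other_3_to_5):
        raise ValueError("strands must have equal length")
    if not strand_5_to_3:
        raise ValueError("strands must be non-empty")
    paired = sum(
        PAIRS.get(a) == b
        for a, b in zip(strand_5_to_3.upper(), other_3_to_5.upper())
    )
    return paired / len(strand_5_to_3)
```

With the example from the text, `complement_3_to_5("AGTC")` gives `"TCAG"`, and
`fraction_paired("AGTC", "TCAG")` is 1.0 (complete complementarity).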

The term "embryonic stem cells (ES)" refers to a cell that can give rise to many
differentiated cell types in an embryo or an adult, including the germ cells. The term "germ line
stem cells (GSCs)" refers to stem cells derived from primordial stem cells that provide a steady
and continuous source of germ cells for the production of gametes. The term "primordial germ
cells (PGCs)" refers to a small population of cells set aside from other cell lineages particularly
from the yolk sac, mesenteries, or gonadal ridges during embryogenesis that have the potential to
differentiate into germ cells and other cells. PGCs are the source from which GSCs and ES cells
are derived. The PGCs, the GSCs and the ES cells are capable of self-renewal. Thus these cells
not only populate the germ line and give rise to a plurality of terminally differentiated cells that
comprise the adult specialized organs, but are able to regenerate themselves.

The term "expression modulating fragment," EMF, means a series of nucleotides which
modulates the expression of an operably linked ORF or another EMF.

As used herein, a sequence is said to "modulate the expression of an operably linked
sequence" when the expression of the sequence is altered by the presence of the EMF. EMFs
include, but are not limited to, promoters, and promoter modulating sequences (inducible
elements). One class of EMFs are nucleic acid fragments which induce the expression of an
operably linked ORF in response to a specific regulatory factor or physiological event.

The terms "nucleotide sequence" or "nucleic acid" or "polynucleotide" or
"oligonucleotide" are used interchangeably and refer to a heteropolymer of nucleotides or the
sequence of these nucleotides. These phrases also refer to DNA or RNA of genomic or synthetic
origin which may be single-stranded or double-stranded and may represent the sense or the
antisense strand, to peptide nucleic acid (PNA) or to any DNA-like or RNA-like material. In the
sequences herein A is adenine, C is cytosine, T is thymine, G is guanine and N is A, C, G or T
(U). It is contemplated that where the polynucleotide is RNA, the T (thymine) in the sequences
provided herein is substituted with U (uracil). Generally, nucleic acid segments provided by this
invention may be assembled from fragments of the genome and short oligonucleotide linkers, or
from a series of oligonucleotides, or from individual nucleotides, to provide a synthetic nucleic
acid which is capable of being expressed in a recombinant transcriptional unit comprising
regulatory elements derived from a microbial or viral operon, or a eukaryotic gene.

The terms "oligonucleotide fragment" or a "polynucleotide fragment", "portion," or
"segment" or "probe" or "primer" are used interchangeably and refer to a sequence of nucleotide
residues which are at least about 5 nucleotides, more preferably at least about 7 nucleotides,
more preferably at least about 9 nucleotides, more preferably at least about 11 nucleotides and
most preferably at least about 17 nucleotides. The fragment is preferably less than about 500
nucleotides, preferably less than about 200 nucleotides, more preferably less than about 100
nucleotides, more preferably less than about 50 nucleotides and most preferably less than 30
nucleotides. Preferably the probe is from about 6 nucleotides to about 200 nucleotides,
preferably from about 15 to about 50 nucleotides, more preferably from about 17 to 30
nucleotides and most preferably from about 20 to 25 nucleotides. Preferably the fragments can
be used in polymerase chain reaction (PCR), various hybridization procedures or microarray
procedures to identify or amplify identical or related parts of mRNA or DNA molecules. A
fragment or segment may uniquely identify each polynucleotide sequence of the present
invention.

Probes may, for example, be used to determine whether specific mRNA molecules are
present in a cell or tissue or to isolate similar nucleic acid sequences from chromosomal DNA as
described by Walsh et al. (Walsh, P.S. et al., 1992, PCR Methods Appl. 1:241-250). They may
be labeled by nick translation, Klenow fill-in reaction, PCR, or other methods well known in the
art. Probes of the present invention, their preparation and/or labeling are elaborated in
Sambrook, J. et al., 1989, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratory, NY; or Ausubel, F.M. et al., 1989, Current Protocols in Molecular Biology, John
Wiley & Sons, New York NY, both of which are incorporated herein by reference in their
entirety.

The nucleic acid sequences of the present invention also include the sequence
information from the nucleic acid sequences of SEQ ID NO:1. The sequence information can be
a segment of any one of SEQ ID NO:1 that uniquely identifies or represents the sequence
information of that sequence of SEQ ID NO:1. One such segment can be a twenty-mer nucleic
acid sequence because the probability that a twenty-mer is fully matched in the human genome is
approximately 1 in 300. In the human genome, there are three billion base pairs in one set of
chromosomes. Because 4^20 possible twenty-mers exist, there are approximately 300 times more
twenty-mers than there are base pairs in a set of human chromosomes. Using the same analysis,
the probability for a seventeen-mer to be fully matched in the human genome is approximately
1 in 5. When these segments are used in arrays for expression studies, fifteen-mer segments can
be used. The probability that the fifteen-mer is fully matched in the expressed sequences is also
approximately one in five because expressed sequences comprise less than approximately 5% of
the entire genome sequence.
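
The arithmetic behind these estimates can be reproduced with a short sketch. It is
illustrative only and not part of the original disclosure; the function names are hypothetical.
The constants are taken from the text: three billion base pairs per set of chromosomes, and
expressed sequences at 5% of the genome (the text's stated upper bound).

```python
GENOME_BP = 3_000_000_000   # base pairs in one set of human chromosomes (from the text)
EXPRESSED_FRACTION = 0.05   # upper bound for expressed sequence given in the text


def possible_kmers(k: int) -> int:
    """Number of distinct sequences of length k over the alphabet A, C, G, T."""
    return 4 ** k


def kmers_per_position(k: int, target_bp: float = GENOME_BP) -> float:
    """Ratio of the k-mer space to the number of base positions in the target.

    A ratio of r corresponds to roughly a 1-in-r chance that a given k-mer
    is fully matched somewhere in the target.
    """
    return possible_kmers(k) / target_bp


# Twenty-mer and seventeen-mer against the whole genome.
ratio_20 = kmers_per_position(20)
ratio_17 = kmers_per_position(17)
# Fifteen-mer against expressed sequence only.
ratio_15_expressed = kmers_per_position(15, GENOME_BP * EXPRESSED_FRACTION)
```

Under these assumptions the ratios come to about 367, 5.7 and 7.2, the same order of
magnitude as the rounded figures of 300, five and five quoted in the text.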

Similarly, when using sequence information for detecting a single mismatch, a segment can
be a twenty-five mer. The probability that the twenty-five mer would appear in a human genome
with a single mismatch is calculated by multiplying the probability for a full match (1 ÷ 4^25) times the
increased probability for mismatch at each nucleotide position (3 x 25). The probability that an
eighteen mer with a single mismatch can be detected in an array for expression studies is
approximately one in five. The probability that a twenty-mer with a single mismatch can be
detected in a human genome is approximately one in five.
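
The single-mismatch calculation can be sketched in the same way. This is a hedged
illustration, not part of the original disclosure: the function names are hypothetical, and
scaling the per-position probability by the size of the target (three billion base pairs for the
genome, 5% of that for expressed sequence) is an assumption carried over from the full-match
analysis.

```python
GENOME_BP = 3_000_000_000             # base pairs per set of chromosomes (from the text)
EXPRESSED_BP = GENOME_BP * 5 // 100   # assumption: expressed sequence is 5% of the genome


def single_mismatch_probability(k: int) -> float:
    """Full-match probability (1 / 4**k) times the 3 * k single-mismatch variants.

    Each of the k positions can be replaced by any of the 3 other bases.
    """
    return (3 * k) / 4 ** k


def expected_single_mismatch_hits(k: int, target_bp: int = GENOME_BP) -> float:
    """Expected number of single-mismatch occurrences of one k-mer in the target.

    Assumption: every base position in the target is an independent trial.
    """
    return single_mismatch_probability(k) * target_bp


hits_20_genome = expected_single_mismatch_hits(20)
hits_18_expressed = expected_single_mismatch_hits(18, EXPRESSED_BP)
```

Under these assumptions the twenty-mer gives roughly 0.16 expected hits in the genome and
the eighteen-mer roughly 0.12 in expressed sequence, which is of the same order as the
"approximately one in five" stated in the text.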
The term "open reading frame," ORF, means a series of nucleotide triplets coding for
amino acids without any termination codons and is a sequence translatable into protein.
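
A minimal sketch of this definition, offered as an illustration rather than as part of the
original disclosure: the function name is hypothetical, and the stop codons are those of the
standard genetic code (TAA, TAG, TGA), which the text does not enumerate.

```python
# Termination codons of the standard genetic code (DNA alphabet).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_open_reading_frame(seq: str) -> bool:
    """Return True if seq is a whole number of triplets with no termination codon.

    The sequence is read in frame from its first base; stop codons that
    appear only out of frame do not interrupt the reading frame.
    """
    seq = seq.upper()
    if len(seq) == 0 or len(seq) % 3 != 0:
        return False
    codons = (seq[i:i + 3] for i in range(0, len(seq), 3))
    return all(codon not in STOP_CODONS for codon in codons)
```

For example, `is_open_reading_frame("ATGGCTAAG")` is true even though the letters TAA
occur in the sequence, because they span two in-frame codons (GCT and AAG).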

The terms "operably linked" or "operably associated" refer to functionally related nucleic
acid sequences. For example, a promoter is operably associated or operably linked with a coding
sequence if the promoter controls the transcription of the coding sequence. While operably
linked nucleic acid sequences can be contiguous and in the same reading frame, certain genetic
elements, e.g., repressor genes, are not contiguously linked to the coding sequence but still control
transcription/translation of the coding sequence.

The term "pluripotent" refers to the capability of a cell to differentiate into a number of
differentiated cell types that are present in an adult organism. A pluripotent cell is restricted in its
differentiation capability in comparison to a totipotent cell.

The terms "polypeptide" or "peptide" or "amino acid sequence" refer to an oligopeptide,
peptide, polypeptide or protein sequence or fragment thereof and to naturally occurring or
synthetic molecules. A polypeptide "fragment," "portion," or "segment" is a stretch of amino
acid residues of at least about 5 amino acids, preferably at least about 7 amino acids, more
preferably at least about 9 amino acids and most preferably at least about 17 or more amino
acids. The peptide preferably is not greater than about 200 amino acids, more preferably less
than 150 amino acids and most preferably less than 100 amino acids. Preferably the peptide is
from about 5 to about 200 amino acids. To be active, any polypeptide must have sufficient
length to display biological and/or immunological activity.

The term "naturally occurring polypeptide" refers to polypeptides produced by cells that
have not been genetically engineered and specifically contemplates various polypeptides arising
from post-translational modifications of the polypeptide including, but not limited to,
acetylation, carboxylation, glycosylation, phosphorylation, lipidation and acylation.

The term "translated protein coding portion" means a sequence which encodes the full
length protein which may include any leader sequence or any processing sequence.

The term "mature protein coding sequence" means a sequence which encodes a peptide
or protein without a signal or leader sequence. The "mature protein portion" means that portion
of the protein which does not include a signal or leader sequence. The peptide may have been
produced by processing in the cell which removes any leader/signal sequence. The mature
protein portion may or may not include the initial methionine residue. The methionine residue
may be removed from the protein during processing in the cell. The peptide may be produced
synthetically or the protein may have been produced using a polynucleotide encoding only
the mature protein coding sequence.

The term "derivative" refers to polypeptides chemically modified by such techniques as
ubiquitination, labeling (e.g., with radionuclides or various enzymes), covalent polymer
attachment such as pegylation (derivatization with polyethylene glycol) and insertion or
substitution by chemical synthesis of amino acids such as ornithine, which do not normally occur
in human proteins.

The term "variant" (or "analog") refers to any polypeptide differing from naturally
occurring polypeptides by amino acid insertions, deletions, and substitutions, created using, e.g.,
recombinant DNA techniques. Guidance in determining which amino acid residues may be
replaced, added or deleted without abolishing activities of interest, may be found by comparing
the sequence of the particular polypeptide with that of homologous peptides and minimizing the
number of amino acid sequence changes made in regions of high homology (conserved regions)
or by replacing amino acids with consensus sequence.

Alternatively, recombinant variants encoding these same or similar polypeptides may be
synthesized or selected by making use of the "redundancy" in the genetic code. Various codon
substitutions, such as the silent changes which produce various restriction sites, may be
introduced to optimize cloning into a plasmid or viral vector or expression in a particular
prokaryotic or eukaryotic system. Mutations in the polynucleotide sequence may be reflected in
the polypeptide or domains of other peptides added to the polypeptide to modify the properties of
any part of the polypeptide, to change characteristics such as ligand-binding affinities, interchain
affinities, or degradation/turnover rate.

Preferably, amino acid "substitutions" are the result of replacing one amino acid with
another amino acid having similar structural and/or chemical properties, i.e., conservative amino
acid replacements. "Conservative" amino acid substitutions may be made on the basis of
similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic
nature of the residues involved. For example, nonpolar (hydrophobic) amino acids include
alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan, and methionine; polar
neutral amino acids include glycine, serine, threonine, cysteine, tyrosine, asparagine, and
glutamine; positively charged (basic) amino acids include arginine, lysine, and histidine; and
negatively charged (acidic) amino acids include aspartic acid and glutamic acid. "Insertions" or
"deletions" are preferably in the range of about 1 to 20 amino acids, more preferably 1 to 10
amino acids. The variation allowed may be experimentally determined by systematically making
insertions, deletions, or substitutions of amino acids in a polypeptide molecule using
recombinant DNA techniques and assaying the resulting recombinant variants for activity.
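
The four residue classes listed above can be captured in a small lookup, sketched here for
illustration only and not as part of the original disclosure. The names are hypothetical, the
residues are written in standard one-letter codes, and treating "same class" as the test for a
conservative substitution is a simplifying assumption; the text allows conservation to be
judged on several properties.

```python
# Residue classes as listed in the text, in standard one-letter codes.
RESIDUE_CLASSES = {
    "nonpolar": set("ALIVPFWM"),      # Ala, Leu, Ile, Val, Pro, Phe, Trp, Met
    "polar_neutral": set("GSTCYNQ"),  # Gly, Ser, Thr, Cys, Tyr, Asn, Gln
    "basic": set("RKH"),              # Arg, Lys, His
    "acidic": set("DE"),              # Asp, Glu
}


def class_of(residue: str) -> str:
    """Return the class name for one of the twenty standard amino acids."""
    for name, members in RESIDUE_CLASSES.items():
        if residue.upper() in members:
            return name
    raise ValueError(f"not a standard amino acid code: {residue!r}")


def is_conservative(original: str, replacement: str) -> bool:
    """Simplified test: a substitution is conservative if both residues share a class."""
    return class_of(original) == class_of(replacement)
```

Under this simplification, replacing leucine with isoleucine (`is_conservative("L", "I")`)
counts as conservative, while replacing lysine with aspartic acid does not.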
Alternatively, where alteration of function is desired, insertions, deletions or
non-conservative alterations can be engineered to produce altered polypeptides. Such alterations
can, for example, alter one or more of the biological functions or biochemical characteristics of
the polypeptides of the invention. For example, such alterations may change polypeptide
characteristics such as ligand-binding affinities, interchain affinities, or degradation/turnover
rate. Further, such alterations can be selected so as to generate polypeptides that are better suited
for expression, scale up and the like in the host cells chosen for expression. For example,
cysteine residues can be deleted or substituted with another amino acid residue in order to
eliminate disulfide bridges.

The terms "purified" or "substantially purified" as used herein denote that the indicated
nucleic acid or polypeptide is present in the substantial absence of other biological
macromolecules, e.g., polynucleotides, proteins, etc. In one embodiment, the polynucleotide or
polypeptide is purified such that it constitutes at least 85% by weight, at least 90% by weight,
more preferably at least 95% by weight, most preferably at least 99% by weight, of the indicated
biological macromolecules present (but water, buffers, and other small molecules, especially
molecules having a molecular weight of less than 1000 daltons, can be present).

The term "isolated" as used herein refers to a nucleic acid or polypeptide separated from
at least one other component (e.g., nucleic acid or polypeptide) present with the nucleic acid or
polypeptide in its natural source. In one embodiment, the nucleic acid or polypeptide is found in
the presence of (if anything) only a solvent, buffer, ion, or other component normally present in a
solution of the same. The terms "isolated" and "purified" do not encompass nucleic acids or
polypeptides present in their natural source.

The term "recombinant," when used herein to refer to a polypeptide or protein, means
that a polypeptide or protein is derived from recombinant (e.g., microbial, insect, or mammalian)
expression systems. "Microbial" refers to recombinant polypeptides or proteins made in
bacterial or fungal (e.g., yeast) expression systems. As a product, "recombinant microbial"
defines a polypeptide or protein essentially free of native endogenous substances and
unaccompanied by associated native glycosylation. Polypeptides or proteins expressed in most
bacterial cultures, e.g., E. coli, will be free of glycosylation modifications; polypeptides or
proteins expressed in yeast will have a glycosylation pattern in general different from those
expressed in mammalian cells.

The term "recombinant expression vehicle or vector" refers to a plasmid or phage or virus
or vector, for expressing a polypeptide from a DNA (RNA) sequence. An expression vehicle can
comprise a transcriptional unit comprising an assembly of (1) a genetic element or elements
having a regulatory role in gene expression, for example, promoters or enhancers, (2) a structural
or coding sequence which is transcribed into mRNA and translated into protein, and (3)
appropriate transcription initiation and termination sequences. Structural units intended for use
in yeast or eukaryotic expression systems preferably include a leader sequence enabling
extracellular secretion of translated protein by a host cell. Alternatively, where recombinant
protein is expressed without a leader or transport sequence, it may include an amino terminal
methionine residue. This residue may or may not be subsequently cleaved from the expressed
recombinant protein to provide a final product.

The term "recombinant expression system" means host cells which have stably integrated
a recombinant transcriptional unit into chromosomal DNA or carry the recombinant
transcriptional unit extrachromosomally. Recombinant expression systems as defined herein will
express heterologous polypeptides or proteins upon induction of the regulatory elements linked
to the DNA segment or synthetic gene to be expressed. This term also means host cells which
have stably integrated a recombinant genetic element or elements having a regulatory role in
gene expression, for example, promoters or enhancers. Recombinant expression systems as
defined herein will express polypeptides or proteins endogenous to the cell upon induction of the
regulatory elements linked to the endogenous DNA segment or gene to be expressed. The cells
can be prokaryotic or eukaryotic.

The term "secreted" includes a protein that is transported across or through a membrane,
including transport as a result of signal sequences in its amino acid sequence when it is
expressed in a suitable host cell. "Secreted" proteins include without limitation proteins secreted
wholly (e.g., soluble proteins) or partially (e.g., receptors) from the cell in which they are
expressed. "Secreted" proteins also include without limitation proteins that are transported
across the membrane of the endoplasmic reticulum. "Secreted" proteins are also intended to
include proteins containing non-typical signal sequences (e.g., Interleukin-1 Beta, see Krasney,
P.A. and Young, P.R. (1992) Cytokine 4(2):134-143) and factors released from damaged cells
(e.g., Interleukin-1 Receptor Antagonist, see Arend, W.P. et al. (1998) Annu. Rev. Immunol.
16:27-55).

Where desired, an expression vector may be designed to contain a "signal or leader
sequence" which will direct the polypeptide through the membrane of a cell. Such a sequence
may be naturally present on the polypeptides of the present invention or provided from
heterologous protein sources by recombinant DNA techniques.

The term "stringent" is used to refer to conditions that are commonly understood in the
art as stringent. Stringent conditions can include highly stringent conditions (i.e., hybridization
to filter-bound DNA in 0.5 M NaHPO4, 7% sodium dodecyl sulfate (SDS), 1 mM EDTA at
65°C, and washing in 0.1X SSC/0.1% SDS at 68°C), and moderately stringent conditions (i.e.,
washing in 0.2X SSC/0.1% SDS at 42°C). Other exemplary hybridization conditions are
described herein in the examples.

In instances of hybridization of deoxyoligonucleotides, additional exemplary stringent
hybridization conditions include washing in 6X SSC/0.05% sodium pyrophosphate at 37°C (for
14-base oligonucleotides), 48°C (for 17-base oligonucleotides), 55°C (for 20-base oligonucleotides), and
60°C (for 23-base oligonucleotides).
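
The oligonucleotide wash temperatures listed above lend themselves to a simple lookup table. The sketch below is illustrative only: the dictionary and function names are hypothetical. It deliberately returns nothing for lengths the text does not list and does not interpolate.

```python
# Wash temperatures (degrees C) in 6X SSC/0.05% sodium pyrophosphate, keyed by
# oligonucleotide length in bases, as listed in the text. Names are hypothetical.
OLIGO_WASH_TEMP_C = {14: 37, 17: 48, 20: 55, 23: 60}


def oligo_wash_temp_c(length):
    """Return the listed wash temperature, or None if the length is not listed."""
    return OLIGO_WASH_TEMP_C.get(length)
```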

As used herein, "substantially equivalent" can refer both to nucleotide and amino acid
sequences, for example a mutant sequence, that varies from a reference sequence by one or more
substitutions, deletions, or additions, the net effect of which does not result in an adverse
functional dissimilarity between the reference and subject sequences. Typically, such a
substantially equivalent sequence varies from one of those listed herein by no more than about
35% (i.e., the number of individual residue substitutions, additions, and/or deletions in a
substantially equivalent sequence, as compared to the corresponding reference sequence, divided
by the total number of residues in the substantially equivalent sequence is about 0.35 or less).
Such a sequence is said to have 65% sequence identity to the listed sequence. In one
embodiment, a substantially equivalent, e.g., mutant, sequence of the invention varies from a
listed sequence by no more than 30% (70% sequence identity); in a variation of this
embodiment, by no more than 25% (75% sequence identity); in a further variation of this
embodiment, by no more than 20% (80% sequence identity); in a further variation of this
embodiment, by no more than 10% (90% sequence identity); and in a further variation of this
embodiment, by no more than 5% (95% sequence identity). Substantially equivalent, e.g.,
mutant, amino acid sequences according to the invention preferably have at least 80% sequence
identity with a listed amino acid sequence, more preferably at least 90% sequence identity.
Substantially equivalent nucleotide sequences of the invention can have lower percent sequence
identities, taking into account, for example, the redundancy or degeneracy of the genetic code.
Preferably, the nucleotide sequence has at least about 65% identity, more preferably at least about
75% identity, and most preferably at least about 95% identity. For the purposes of the present
invention, sequences having substantially equivalent biological activity and substantially
equivalent expression characteristics are considered substantially equivalent. For the purposes of
determining equivalence, truncation of the mature sequence (e.g., via a mutation which creates a
spurious stop codon) should be disregarded. Sequence identity may be determined, e.g., using
the Jotun Hein method (Hein, J. (1990) Methods Enzymol. 183:626-645). Identity between
sequences can also be determined by other methods known in the art, e.g., by varying
hybridization conditions.
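
The variation ratio defined above (changed residues divided by the number of residues in the substantially equivalent sequence) can be sketched as follows. This is an illustration under stated assumptions, not the Jotun Hein method: it assumes the two sequences have already been aligned by some other tool, with "-" marking gaps, and the function names and example sequences are hypothetical.

```python
def fraction_varied(reference_aln, subject_aln):
    """Changed positions divided by the residue count of the subject sequence.

    Both arguments are rows of a precomputed pairwise alignment of equal
    length, using '-' for gaps (an assumption of this sketch).
    """
    if len(reference_aln) != len(subject_aln):
        raise ValueError("aligned sequences must have equal length")
    changes = sum(1 for r, s in zip(reference_aln, subject_aln) if r != s)
    subject_residues = sum(1 for s in subject_aln if s != "-")
    if subject_residues == 0:
        raise ValueError("subject sequence has no residues")
    return changes / subject_residues


def is_substantially_equivalent(reference_aln, subject_aln, max_fraction=0.35):
    """Apply the 'no more than about 35%' sequence criterion from the text."""
    return fraction_varied(reference_aln, subject_aln) <= max_fraction


# Invented example: one substitution in ten residues gives 10% variation,
# which corresponds to 90% sequence identity in the text's terms.
variation = fraction_varied("MKTAYIAKQR", "MKTGYIAKQR")
```

The check covers only the sequence-based criterion; the functional criteria in the definition (equivalent biological activity and expression characteristics) are outside what a sequence comparison can show.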

The term "totipotent" refers to the capability of a cell to differentiate into all of the cell
types of an adult organism.

The term "transformation" means introducing DNA into a suitable host cell so that the
DNA is replicable, either as an extrachromosomal element, or by chromosomal integration. The
term "transfection" refers to the taking up of an expression vector by a suitable host cell, whether
or not any coding sequences are in fact expressed. The term "infection" refers to the introduction
of nucleic acids into a suitable host cell by use of a virus or viral vector.

As used herein, an "uptake modulating fragment," UMF, means a series of nucleotides
which mediate the uptake of a linked DNA fragment into a cell. UMFs can be readily identified
using known UMFs as a target sequence or target motif with the computer-based systems
described below. The presence and activity of a UMF can be confirmed by attaching the
suspected UMF to a marker sequence. The resulting nucleic acid molecule is then incubated
with an appropriate host under appropriate conditions and the uptake of the marker sequence is
determined. As described above, a UMF will increase the frequency of uptake of a linked
marker sequence.

Each of the above terms is meant to encompass all that is described for each, unless the
context dictates otherwise.



4.2 NUCLEIC ACIDS OF THE INVENTION 

Nucleotide sequences of the invention are set forth in the Sequence Listing. 

The isolated polynucleotides of the invention include a polynucleotide comprising the
nucleotide sequences of Table 1 (SEQ ID NO: 1); a polynucleotide encoding any one of the
peptide sequences of Table 2 (SEQ ID NO: 2) or Example 2 (SEQ ID NO: 3); and a
polynucleotide comprising the nucleotide sequence encoding the mature protein coding sequence
of the polynucleotides of any one of SEQ ID NO: 2 or 3. The polynucleotides of the present
invention also include, but are not limited to, a polynucleotide that hybridizes under stringent
conditions to (a) the complement of any of the nucleotide sequences of SEQ ID NO: 1; (b)
nucleotide sequences encoding any one of the amino acid sequences set forth in the Sequence
Listing; (c) a polynucleotide which is an allelic variant of any polynucleotide recited above; (d)
a polynucleotide which encodes a species homolog of any of the proteins recited above; or (e) a
polynucleotide that encodes a polypeptide comprising a specific domain or truncation of the
polypeptide of SEQ ID NO: 2 or 3. Domains of interest may depend on the nature of the encoded
polypeptide; e.g., domains in receptor-like polypeptides include ligand-binding, extracellular,
transmembrane, or cytoplasmic domains, or combinations thereof; domains in immunoglobulin-like
proteins include the variable immunoglobulin-like domains; domains in enzyme-like
polypeptides include catalytic and substrate binding domains; and domains in ligand
polypeptides include receptor-binding domains.

Table 1: 784_3137 contig: Nucleotide sequence (SEQ ID NO: 1)

CCOVCGCGTCCGCCCACGCGTCCGTACACSACCACTUVTTACTATATATCTCGAATATATGGTC^^ 

CTGCCAGCCQQGATTTATQGGTGAACATAQACCAAATGGAAAAAGATAAAGTGAAGATTCAT^^ 

TCCAATACTCATCGGCAAGCTGOUlGAGTGAATCTGTCCTTCGATTTTCt^ 

TQAAATCACTGTGGCAACCGGQGGTTTCATATACACTGGAGAAGTCGTACATCGAATGCT^ 

AQTACATAGCACCTTTAATCGCAAATTTCGATCCCTiLGTGTATCCAGAAATTCyACTG^ 

AATGGCACAGCACTTGTGGTCCAQTGGGACCaTGTACATCTCCAGGATAATTATAACCTGGGAAGCTTCAC 

ATTCCAGGCAACCCTGCTCATGGATQQACGAATCATCTTTGQATACAAAGAAATTCCTGTCTTGGTCAC^ 

AGATAAGTTCAACa^TCATCCAQTGAAAGTCGGACTGTCCGATGCATTTGTCGTTGTCCAC^^ 

CAAATTCCCTlATGTTCGAAGAAGAACAATTTATGAATACCACCGAGTAGAGCTACy^TG^ 

CAACy^TTTaSGCTGTGQAQATGACCCCy^TTACCCACaLTQCCTCCAGTTTAACAQAT 

CTTCTCAGATTGGCTTCAACTGCAGTTGGTGTAGTAAACTTCAAAGATGTTCC^ 

CGGCAGGACTGGGTGGACy^GTGGATGCCCTGAAGAGTCAAAAGAGAAGATGTGTGAGAATACAGAACCAGT 

GGAAACTTCTTCTCGAACCACCACTUVCCATAGGAGCGACAACCACCCAGTTCAGGGTCCTA^ 

GAAGAGCAGTGACTTCTOVGTTTCCCACCAGCCTCCCTACAGAAGATGATACCAAGATAGCACTA^ 

AAAGATAATGGAGCTTCTACAQATGACyVGTGCAaCTGAGAAQAAAGGGGGAACCCTCCACGCTGGCCTC 

CGTTGGAATCCTCATCCTGGTCCTCATTGTAGCCACAGCCATTCTTGTGAC^ 

CAACATCAGO^GCCAGCATCTTCTTTATTGAGAGACGCCCAAGCy^GATGGCCTO 

GGCTCTGGACATCCTGCCTATGCTGAAGTTGAACCAGTTQGAGAQAAAGAAaQCTTTATTGTATC^ 

GTGCTAAAATTTCTAGGACAGAACAAO^CCAGTACTGGTTTACAGGTGTTAAGA 

CCTTTAAGACy^CAAACAAACa^CACACACAAACaAGCT 

TCTGGAO^GCTCAGCCCAGGAAACy^GGGTAAACAAAAAACTAAAACTTATAC^ 

TGAACATAGAATTCCCTAGTGGAATGTCATCTATAGTTCACTCGGAACATCTCCCGTGGACTTATCT6AAG 

TATGAO^QATTATAATGCmTGGCTTAGGTGCAGGGTTGCAAAGGGAT^ 

AAGCTTTAGTTCATGAGGGATCGACACCTTTGGTTCAAATGTTCTCTGATGTCTCAAAGATAACTGTTTTC 

CyUU^CCTGAACCCTTTCyiCTCAAAAGAGCAATGATGAATGTCTCAAGATTGCTA^ 

CAAGAQTQAQAACAAACACAAAATAAGAGATTTTCTACATTTTCa^AAACAGATGTGTO 

GTTTTTCTGGTCTAGATCCATCTQTACCAACAAGTTCATCACTTTACAGAACaAATCT 

GGAGGTTCAAACCATGTCTGCCTCTTCCTTTGTAATGAATGACCTTTCTATGAGCTGTGACAA^ 

AAa^TTAGCTAAGGATTaXBGQAAGAGGGGGTGGCAAACGGGQCTTTCTGTTTTCCTQCCT^^ 

ACATCTGATTTATGCTTTATGGAAGCCTTACCTCCAATCCCCAACTGTTAAGTCCCATGAAACC^ 

CTCTGGGCTGATGGAAACAAAAGGAAACAGTATGAAGAGTTCCTTAATCATrTTTGAAA^^ 

GGGATTTTAAACATATGATTATTTTTAATTTTATGCCTTTTCAGTACTAAACACCCATTTCATTO 

CCTGTCTAAGAAGCCATTCACGTCAGO^TGGCGATAGAAAGAATGAAAAAACCCTGCTGAATCATACAGTA 

ATTTTCTTTTIAAGCACATAGTAGTTACATAAATATATATATATAAATATATTTTTGOT^ 

AGGCAGGATCTTGTGACTCTAAGAGTGCGTTTTGTCATCAAGACAAAAO^GATGC^ 

TTACTTCaVTAGAGTTGTAAAATAATCCTTAATATTAGAATATTTTTCTCTCaiCTTAGCyU^ 

GTTCATTGCCGCGCCCATCATQTTCTTGACTATTTGATCCACTTTTTCGTTTATGTCyUV^ 

CTGGCTAAATAAAGTGGATGCAGAAAGCTCCTTAAATGGAA 

Table 2: 784_3137 contig: Encoded polypeptide sequence (SEQ ID NO: 2)

PRVRPRVRTDHir!nfrSRIYGPSDSASRDLWVNIIXJMEKDK\^I 
EITVATGGFIYTGEnA/HRMLTATQYIAPLMANFDPSVSIUTSTVRYPDNaTAIjWQTO 
PQATLLMDGRI IFGYKEIPVLVTQISSTNHPVKVGLSDAFVVVHR 

NISAVEMTPLPTCIiQFlSnRCQPCVSSQIGFNCSWCSKLQRCSSGFDRHRQDWVDSGCPBESKE 

ETFLEPPQPBRQPPSSGSLPPEDAWSQPPTSIiPTEDDTKIALHLKDNGTlSTDDSAT^EKKGGTIJ^ 

ILILVLIVATAILVTVYimfflPTSAASIFFIERRPSRWPAMKPRRGSGHPAYi^^ 



The polynucleotides of the invention include naturally occurring or wholly or partially
synthetic DNA, e.g., cDNA and genomic DNA, and RNA, e.g., mRNA. The polynucleotides
may include all of the coding region of the cDNA or may represent a portion of the coding
region of the cDNA.

The present invention also provides genes corresponding to the cDNA sequences disclosed
herein. The corresponding genes can be isolated in accordance with known methods using the
sequence information disclosed herein. Such methods include the preparation of probes or primers
from the disclosed sequence information for identification and/or amplification of genes in
appropriate genomic libraries or other sources of genomic materials. Further 5' and 3' sequence can
be obtained using methods known in the art. For example, full length cDNA or genomic DNA that
corresponds to any of the polynucleotides of SEQ ID NO: 1 can be obtained by screening
appropriate cDNA or genomic DNA libraries under suitable hybridization conditions using any of
the polynucleotides of SEQ ID NO: 1 or a portion thereof as a probe. Alternatively, the
polynucleotides of SEQ ID NO: 1 may be used as the basis for suitable primer(s) that allow
identification and/or amplification of genes in appropriate genomic DNA or cDNA libraries.

The nucleic acid sequences of the invention can be assembled from ESTs and sequences
(including cDNA and genomic sequences) obtained from one or more public databases, such as
dbEST, gbpri, and UniGene. The EST sequences can provide identifying sequence information,
representative fragment or segment information, or novel segment information for the full-length
gene.

The polynucleotides of the invention also provide polynucleotides including nucleotide
sequences that are substantially equivalent to the polynucleotides recited above. Polynucleotides
according to the invention can have, e.g., at least about 65%, at least about 70%, at least about
75%, at least about 80%, more typically at least about 90%, and even more typically at least
about 95%, sequence identity to a polynucleotide recited above.

Included within the scope of the nucleic acid sequences of the invention are nucleic acid
sequence fragments that hybridize under stringent conditions to any of the nucleotide sequences
of SEQ ID NO: 1, or complements thereof, which fragment is greater than about 5 nucleotides,
preferably 7 nucleotides, more preferably greater than 9 nucleotides and most preferably greater
than 17 nucleotides. Fragments of, e.g., 15, 17, or 20 nucleotides or more that are selective for
(i.e., specifically hybridize to) any one of the polynucleotides of the invention are contemplated.
Probes capable of specifically hybridizing to a polynucleotide can differentiate polynucleotide
sequences of the invention from other polynucleotide sequences in the same family of genes or
can differentiate human genes from genes of other species, and are preferably based on unique
nucleotide sequences.

The sequences falling within the scope of the present invention are not limited to these
specific sequences, but also include allelic and species variations thereof. Allelic and species
variations can be routinely determined by comparing the sequence provided in SEQ ID NO: 1, a
representative fragment thereof, or a nucleotide sequence at least 90% identical, preferably 95%
identical, to SEQ ID NO: 1 with a sequence from another isolate of the same species. Furthermore,
to accommodate codon variability, the invention includes nucleic acid molecules coding for the
same amino acid sequences as do the specific ORFs disclosed herein. In other words, in the coding
region of an ORF, substitution of one codon for another codon that encodes the same amino acid is
expressly contemplated.
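
Codon degeneracy of this kind is easy to check computationally. The sketch below builds the standard genetic code table and tests whether two coding sequences translate to the same amino acid sequence. The function names and the two short example ORFs are hypothetical and are not taken from the Sequence Listing.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order ('*' marks a stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(orf):
    """Translate a DNA coding sequence (length divisible by 3) into one-letter codes."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))


def encode_same_protein(orf_a, orf_b):
    """True if both ORFs encode an identical amino acid sequence."""
    return translate(orf_a) == translate(orf_b)


# Invented example: three synonymous codon substitutions leave Met-Leu-Lys-Gly unchanged.
original = "ATGCTGAAAGGC"
variant = "ATGTTAAAGGGT"
same = encode_same_protein(original, variant)
```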

The nearest neighbor or homology result for the nucleic acids of the present invention,
including SEQ ID NO: 1, can be obtained by searching a database using an algorithm or a program.
Preferably, BLAST (Basic Local Alignment Search Tool) is used to search for
local sequence alignments (Altschul, S.F., J. Mol. Evol. 36:290-300 (1993) and Altschul, S.F. et al., J.
Mol. Biol. 215:403-410 (1990)). Alternatively, a FASTA version 3 search against GenPept, using
the Fastxy algorithm, can be used.

Species homologs (or orthologs) of the disclosed polynucleotides and proteins are also
provided by the present invention. Species homologs may be isolated and identified by making
suitable probes or primers from the sequences provided herein and screening a suitable nucleic
acid source from the desired species.

The invention also encompasses allelic variants of the disclosed polynucleotides or
proteins; that is, naturally-occurring alternative forms of the isolated polynucleotide which also
encode proteins which are identical, homologous or related to that encoded by the
polynucleotides.

The nucleic acid sequences of the invention are further directed to sequences which
encode variants of the described nucleic acids. These amino acid sequence variants may be
prepared by methods known in the art by introducing appropriate nucleotide changes into a
native or variant polynucleotide. There are two variables in the construction of amino acid
sequence variants: the location of the mutation and the nature of the mutation. Nucleic acids
encoding the amino acid sequence variants are preferably constructed by mutating the
polynucleotide to encode an amino acid sequence that does not occur in nature. These nucleic
acid alterations can be made at sites that differ in the nucleic acids from different species
(variable positions) or in highly conserved regions (constant regions). Sites at such locations
will typically be modified in series, e.g., by substituting first with conservative choices (e.g.,
hydrophobic amino acid to a different hydrophobic amino acid) and then with more distant
choices (e.g., hydrophobic amino acid to a charged amino acid), and then deletions or insertions
may be made at the target site. Amino acid sequence deletions generally range from about 1 to
30 residues, preferably about 1 to 10 residues, and are typically contiguous. Amino acid
insertions include amino- and/or carboxyl-terminal fusions ranging in length from one to one
hundred or more residues, as well as intrasequence insertions of single or multiple amino acid
residues. Intrasequence insertions may range generally from about 1 to 10 amino acid residues,
preferably from 1 to 5 residues. Examples of terminal insertions include the heterologous signal
sequences necessary for secretion or for intracellular targeting in different host cells and
sequences such as FLAG or poly-histidine sequences useful for purifying the expressed protein.

In a preferred method, polynucleotides encoding the novel amino acid sequences are
changed via site-directed mutagenesis. This method uses oligonucleotide sequences to alter a
polynucleotide to encode the desired amino acid variant, as well as sufficient adjacent
nucleotides on both sides of the changed amino acid to form a stable duplex on either side of the
site being changed. In general, the techniques of site-directed mutagenesis are well known to
those of skill in the art and this technique is exemplified by publications such as Edelman et al.,
DNA 2:183 (1983). A versatile and efficient method for producing site-specific changes in a
polynucleotide sequence was published by Zoller and Smith, Nucleic Acids Res. 10:6487-6500
(1982). PCR may also be used to create amino acid sequence variants of the novel nucleic acids.
When small amounts of template DNA are used as starting material, primer(s) that differ
slightly in sequence from the corresponding region in the template DNA can generate the desired
amino acid variant. PCR amplification results in a population of product DNA fragments that
differ from the polynucleotide template encoding the polypeptide at the position specified by the
primer. The product DNA fragments replace the corresponding region in the plasmid and this
gives a polynucleotide encoding the desired amino acid variant.
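
The oligonucleotide described above, a changed codon flanked on both sides by unaltered template sequence, can be sketched in a few lines. This is an illustration only: the function name, the flank length, and the example template are assumptions of the sketch, and the text does not specify how many flanking nucleotides are sufficient for a stable duplex.

```python
def mutagenic_primer(template, codon_index, new_codon, flank=12):
    """Return a sense-strand oligonucleotide carrying new_codon at codon_index.

    template: coding-strand ORF starting at the first codon.
    codon_index: 0-based index of the codon to replace.
    flank: unchanged nucleotides kept on each side (hypothetical default).
    """
    if len(new_codon) != 3:
        raise ValueError("new_codon must be exactly 3 nucleotides")
    start = codon_index * 3
    end = start + 3
    if start - flank < 0 or end + flank > len(template):
        raise ValueError("not enough flanking sequence around the codon")
    return template[start - flank:start] + new_codon + template[end:end + flank]


# Invented template: ATG GCT TGT AAA GGT CTG GAA TTC CGT ACC.
# Replace the cysteine codon TGT (index 2) with the serine codon TCT.
template = "ATGGCTTGTAAAGGTCTGGAATTCCGTACC"
primer = mutagenic_primer(template, codon_index=2, new_codon="TCT", flank=6)
```

The example mirrors the cysteine substitution mentioned earlier in the definitions. A real design would also weigh melting temperature and secondary structure, which this sketch ignores.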

A further technique for generating amino acid variants is the cassette mutagenesis
technique described in Wells et al., Gene 34:315 (1985); and other mutagenesis techniques well
known in the art, such as, for example, the techniques in Sambrook et al., supra, and Current
Protocols in Molecular Biology, Ausubel et al. Due to the inherent degeneracy of the genetic
code, other DNA sequences which encode substantially the same or a functionally equivalent
amino acid sequence may be used in the practice of the invention for the cloning and expression
of these novel nucleic acids. Such DNA sequences include those which are capable of
hybridizing to the appropriate novel nucleic acid sequence under stringent conditions.

Polynucleotides encoding preferred polypeptide truncations of the invention can be used 
to generate polynucleotides encoding chimeric or fusion proteins comprising one or more 
domains of the invention and heterologous protein sequences. 

The polynucleotides of the invention additionally include the complement of any of the
polynucleotides recited above. The polynucleotide can be DNA (genomic, cDNA, amplified, or
synthetic) or RNA. Methods and algorithms for obtaining such polynucleotides are well known
to those of skill in the art and can include, for example, methods for determining hybridization
conditions that can routinely isolate polynucleotides of the desired sequence identities.

In accordance with the invention, polynucleotide sequences comprising the mature
protein coding sequences corresponding to any one of SEQ ID NO: 1, or functional equivalents
thereof, may be used to generate recombinant DNA molecules that direct the expression of that
nucleic acid, or a functional equivalent thereof, in appropriate host cells. Also included are the
cDNA inserts of any of the clones identified herein.

A polynucleotide according to the invention can be joined to any of a variety of other
nucleotide sequences by well-established recombinant DNA techniques (see Sambrook J et al.
(1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, NY). Useful
nucleotide sequences for joining to polynucleotides include an assortment of vectors, e.g.,
plasmids, cosmids, lambda phage derivatives, phagemids, and the like, that are well known in the
art. Accordingly, the invention also provides a vector including a polynucleotide of the
invention and a host cell containing the polynucleotide. In general, the vector contains an origin
of replication functional in at least one organism, convenient restriction endonuclease sites, and a
selectable marker for the host cell. Vectors according to the invention include expression
vectors, replication vectors, probe generation vectors, and sequencing vectors. A host cell
according to the invention can be a prokaryotic or eukaryotic cell and can be a unicellular
organism or part of a multicellular organism.

The present invention further provides recombinant constructs comprising a nucleic acid
having any of the nucleotide sequences of SEQ ID NO: 1 or a fragment thereof or any other
polynucleotides of the invention. In one embodiment, the recombinant constructs of the present
invention comprise a vector, such as a plasmid or viral vector, into which a nucleic acid having
any of the nucleotide sequences of SEQ ID NO: 1 or a fragment thereof is inserted, in a forward
or reverse orientation. In the case of a vector comprising one of the ORFs of the present
invention, the vector may further comprise regulatory sequences, including for example, a
promoter, operably linked to the ORF. Large numbers of suitable vectors and promoters are
known to those of skill in the art and are commercially available for generating the recombinant
constructs of the present invention. The following vectors are provided by way of example.
Bacterial: pBs, phagescript, PsiX174, pBluescript SK, pBs KS, pNH8a, pNH16a, pNH18a,
pNH46a (Stratagene); pTrc99A, pKK223-3, pKK233-3, pDR540, pRIT5 (Pharmacia).
Eukaryotic: pWLneo, pSV2cat, pOG44, pXT1, pSG (Stratagene); pSVK3, pBPV, pMSG, pSVL
(Pharmacia).

The isolated polynucleotide of the invention may be operably linked to an expression
control sequence such as the pMT2 or pED expression vectors disclosed in Kaufman et al.,
Nucleic Acids Res. 19, 4485-4490 (1991), in order to produce the protein recombinantly. Many
suitable expression control sequences are known in the art. General methods of expressing
recombinant proteins are also known and are exemplified in Kaufman, Methods in Enzymology
185, 537-566 (1990). As defined herein "operably linked" means that the isolated
polynucleotide of the invention and an expression control sequence are situated within a vector
or cell in such a way that the protein is expressed by a host cell which has been transformed
(transfected) with the ligated polynucleotide/expression control sequence.

Promoter regions can be selected from any desired gene using CAT (chloramphenicol
transferase) vectors or other vectors with selectable markers. Two appropriate vectors are
pKK232-8 and pCM7. Particular named bacterial promoters include lacI, lacZ, T3, T7, gpt,
lambda PR, and trc. Eukaryotic promoters include CMV immediate early, HSV thymidine
kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of
the appropriate vector and promoter is well within the level of ordinary skill in the art.

Generally, recombinant expression vectors will include origins of replication and selectable
markers permitting transformation of the host cell, e.g., the ampicillin resistance gene of E. coli
and S. cerevisiae TRP1 gene, and a promoter derived from a highly-expressed gene to direct
transcription of a downstream structural sequence. Such promoters can be derived from operons
encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK), α-factor, acid
phosphatase, or heat shock proteins, among others. The heterologous structural sequence is
assembled in appropriate phase with translation initiation and termination sequences, and
preferably, a leader sequence capable of directing secretion of translated protein into the
periplasmic space or extracellular medium. Optionally, the heterologous sequence can encode a
fusion protein including an amino terminal identification peptide imparting desired
characteristics, e.g., stabilization or simplified purification of expressed recombinant product.
Useful expression vectors for bacterial use are constructed by inserting a structural DNA
sequence encoding a desired protein together with suitable translation initiation and termination
signals in operable reading phase with a functional promoter. The vector will comprise one or
more phenotypic selectable markers and an origin of replication to ensure maintenance of the
vector and to, if desirable, provide amplification within the host. Suitable prokaryotic hosts for
transformation include E. coli, Bacillus subtilis, Salmonella typhimurium and various species
within the genera Pseudomonas, Streptomyces, and Staphylococcus, although others may also be
employed as a matter of choice.

As a representative but non-limiting example, useful expression vectors for bacterial use
can comprise a selectable marker and bacterial origin of replication derived from commercially
available plasmids comprising genetic elements of the well known cloning vector pBR322
(ATCC 37017). Such commercial vectors include, for example, pKK223-3 (Pharmacia Fine
Chemicals, Uppsala, Sweden) and GEM1 (Promega Biotech, Madison, WI, USA). These
pBR322 "backbone" sections are combined with an appropriate promoter and the structural
sequence to be expressed. Following transformation of a suitable host strain and growth of the
host strain to an appropriate cell density, the selected promoter is induced or derepressed by
appropriate means (e.g., temperature shift or chemical induction) and cells are cultured for an
additional period. Cells are typically harvested by centrifugation, disrupted by physical or
chemical means, and the resulting crude extract retained for further purification.

Polynucleotides of the invention can also be used to induce immune responses. For
example, as described in Fan et al., Nat. Biotech. 17:870-872 (1999), incorporated herein by
reference, nucleic acid sequences encoding a polypeptide may be used to generate antibodies
against the encoded polypeptide following topical administration of naked plasmid DNA or
following injection, and preferably intramuscular injection of the DNA. The nucleic acid
sequences are preferably inserted in a recombinant expression vector and may be in the form of
naked DNA.

4.3 ANTISENSE

Another aspect of the invention pertains to isolated antisense nucleic acid molecules that
are hybridizable to or complementary to the nucleic acid molecule comprising the nucleotide
sequence of SEQ ID NO: 1, or fragments, analogs or derivatives thereof. An "antisense" nucleic
acid comprises a nucleotide sequence that is complementary to a "sense" nucleic acid encoding a
protein, e.g., complementary to the coding strand of a double-stranded cDNA molecule or
complementary to an mRNA sequence. In specific aspects, antisense nucleic acid molecules are
provided that comprise a sequence complementary to at least about 10, 25, 50, 100, 250 or 500
nucleotides or an entire coding strand, or to only a portion thereof. Nucleic acid molecules
encoding fragments, homologs, derivatives and analogs of a protein of any of SEQ ID NO: 2 or 3,
or antisense nucleic acids complementary to a nucleic acid sequence of SEQ ID NO: 1 are
additionally provided.

In one embodiment, an antisense nucleic acid molecule is antisense to a "coding region"
of the coding strand of a nucleotide sequence of the invention. The term "coding region" refers
to the region of the nucleotide sequence comprising codons which are translated into amino acid
residues. In another embodiment, the antisense nucleic acid molecule is antisense to a
"noncoding region" of the coding strand of a nucleotide sequence of the invention. The term
"noncoding region" refers to 5' and 3' sequences which flank the coding region that are not
translated into amino acids (i.e., also referred to as 5' and 3' untranslated regions).

Given the coding strand sequences encoding a nucleic acid disclosed herein (e.g., SEQ
ID NO:1), antisense nucleic acids of the invention can be designed according to the rules of
Watson and Crick or Hoogsteen base pairing. The antisense nucleic acid molecule can be
complementary to the entire coding region of an mRNA, but more preferably is an oligonucleotide
that is antisense to only a portion of the coding or noncoding region of an mRNA. For example,
the antisense oligonucleotide can be complementary to the region surrounding the translation
start site of an mRNA. An antisense oligonucleotide can be, for example, about 5, 10, 15, 20, 25,
30, 35, 40, 45 or 50 nucleotides in length. An antisense nucleic acid of the invention can be
constructed using chemical synthesis or enzymatic ligation reactions using procedures known in
the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be
chemically synthesized using naturally occurring nucleotides or variously modified nucleotides
designed to increase the biological stability of the molecules or to increase the physical stability
of the duplex formed between the antisense and sense nucleic acids, e.g., phosphorothioate
derivatives and acridine substituted nucleotides can be used.
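
As an illustration of designing an antisense oligonucleotide by Watson-Crick base pairing, the following minimal sketch derives an oligonucleotide complementary to the region surrounding a translation start site. The function names, the default length of 20 nucleotides, and the example sequence are hypothetical and chosen only for illustration; the example sequence is not SEQ ID NO:1.

```python
# Watson-Crick complements for DNA bases (coding strand given as DNA).
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def antisense_oligo(coding_strand: str, start_codon_index: int, length: int = 20) -> str:
    """Return an oligonucleotide antisense to a window around the start codon.

    coding_strand     -- sense (coding) strand, 5' to 3'
    start_codon_index -- 0-based index of the A in the ATG start codon
    length            -- desired oligonucleotide length (illustrative default)
    """
    half = length // 2
    begin = max(0, start_codon_index - half)
    end = min(len(coding_strand), begin + length)
    target = coding_strand[begin:end]
    return reverse_complement(target)


# Hypothetical coding-strand fragment, used only to exercise the functions.
example = "GCCGCCACCGCCATGGCTAGCAAGGGCGAGGAG"
oligo = antisense_oligo(example, example.index("ATG"), length=20)
print(oligo)
```

The returned oligonucleotide is written 5' to 3' and pairs antiparallel with the targeted window of the sense strand. Hoogsteen pairing and modified nucleotides are not modeled in this sketch.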

Examples of modified nucleotides that can be used to generate the antisense nucleic acid
include: 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine,
4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl-2-thiouridine,
5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine,
inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine,
2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine,
7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil,
beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil, 5-methoxyuracil,
2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil,
queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil,
uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil,
3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine. Alternatively, the
antisense nucleic acid can be produced biologically using an expression vector into which a
nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the
inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest,
described further in the following subsection).

The antisense nucleic acid molecules of the invention are typically administered to a
subject or generated in situ such that they hybridize with or bind to cellular mRNA and/or
genomic DNA encoding a protein according to the invention to thereby inhibit expression of the
protein, e.g., by inhibiting transcription and/or translation. The hybridization can be by
conventional nucleotide complementarity to form a stable duplex, or, for example, in the case of
an antisense nucleic acid molecule that binds to DNA duplexes, through specific interactions in
the major groove of the double helix. An example of a route of administration of antisense
nucleic acid molecules of the invention includes direct injection at a tissue site. Alternatively,
antisense nucleic acid molecules can be modified to target selected cells and then administered
systemically. For example, for systemic administration, antisense molecules can be modified
such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g.,
by linking the antisense nucleic acid molecules to peptides or antibodies that bind to cell surface
receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using
the vectors described herein. To achieve sufficient intracellular concentrations of antisense
molecules, vector constructs in which the antisense nucleic acid molecule is placed under the
control of a strong pol II or pol III promoter are preferred.

In yet another embodiment, the antisense nucleic acid molecule of the invention is an
α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific
double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the
strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res 15: 6625-6641). The
antisense nucleic acid molecule can also comprise a 2'-O-methylribonucleotide (Inoue et al.
(1987) Nucleic Acids Res 15: 6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987)
FEBS Lett 215: 327-330).

4.4 RIBOZYMES AND PNA MOIETIES 

In still another embodiment, an antisense nucleic acid of the invention is a ribozyme.
Ribozymes are catalytic RNA molecules with ribonuclease activity that are capable of cleaving a
single-stranded nucleic acid, such as an mRNA, to which they have a complementary region.
Thus, ribozymes (e.g., hammerhead ribozymes (described in Haseloff and Gerlach (1988)
Nature 334:585-591)) can be used to catalytically cleave mRNA transcripts to thereby inhibit
translation of an mRNA. A ribozyme having specificity for a nucleic acid of the invention can be
designed based upon the nucleotide sequence of a DNA disclosed herein (i.e., SEQ ID NO:1).
For example, a derivative of a Tetrahymena L-19 IVS RNA can be constructed in which the
nucleotide sequence of the active site is complementary to the nucleotide sequence to be cleaved
in a SECX-encoding mRNA. See, e.g., Cech et al. U.S. Pat. No. 4,987,071; and Cech et al. U.S.
Pat. No. 5,116,742. Alternatively, SECX mRNA can be used to select a catalytic RNA having a
specific ribonuclease activity from a pool of RNA molecules. See, e.g., Bartel et al. (1993)
Science 261:1411-1418.

Alternatively, gene expression can be inhibited by targeting nucleotide sequences
complementary to the regulatory region (e.g., promoter and/or enhancers) to form triple helical
structures that prevent transcription of the gene in target cells. See generally, Helene (1991)
Anticancer Drug Des. 6: 569-84; Helene et al. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and
Maher (1992) 14: 807-15.

In various embodiments, the nucleic acids of the invention can be modified at the base
moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or
solubility of the molecule. For example, the deoxyribose phosphate backbone of the nucleic
acids can be modified to generate peptide nucleic acids (see Hyrup et al. (1996) Bioorg Med
Chem 4: 5-23). As used herein, the terms "peptide nucleic acids" or "PNAs" refer to nucleic acid
mimics, e.g., DNA mimics, in which the deoxyribose phosphate backbone is replaced by a
pseudopeptide backbone and only the four natural nucleobases are retained. The neutral
backbone of PNAs has been shown to allow for specific hybridization to DNA and RNA under
conditions of low ionic strength. The synthesis of PNA oligomers can be performed using
standard solid phase peptide synthesis protocols as described in Hyrup et al. (1996) above;
Perry-O'Keefe et al. (1996) PNAS 93: 14670-675.

PNAs of the invention can be used in therapeutic and diagnostic applications. For
example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of
gene expression by, e.g., inducing transcription or translation arrest or inhibiting replication.
PNAs of the invention can also be used, e.g., in the analysis of single base pair mutations in a
gene by, e.g., PNA directed PCR clamping; as artificial restriction enzymes when used in
combination with other enzymes, e.g., S1 nucleases (Hyrup B. (1996) above); or as probes or
primers for DNA sequence and hybridization (Hyrup et al. (1996), above; Perry-O'Keefe (1996),
above).

In another embodiment, PNAs of the invention can be modified, e.g., to enhance their
stability or cellular uptake, by attaching lipophilic or other helper groups to PNA, by the
formation of PNA-DNA chimeras, or by the use of liposomes or other techniques of drug
delivery known in the art. For example, PNA-DNA chimeras can be generated that may
combine the advantageous properties of PNA and DNA. Such chimeras allow DNA recognition
enzymes, e.g., RNase H and DNA polymerases, to interact with the DNA portion while the PNA
portion would provide high binding affinity and specificity. PNA-DNA chimeras can be linked
using linkers of appropriate lengths selected in terms of base stacking, number of bonds between
the nucleobases, and orientation (Hyrup (1996) above). The synthesis of PNA-DNA chimeras
can be performed as described in Hyrup (1996) above and Finn et al. (1996) Nucl Acids Res 24:
3357-63. For example, a DNA chain can be synthesized on a solid support using standard
phosphoramidite coupling chemistry, and modified nucleoside analogs, e.g., 5'-(4-methoxytrityl)
amino-5'-deoxy-thymidine phosphoramidite, can be used between the PNA and the 5' end of
DNA (Mag et al. (1989) Nucl Acid Res 17: 5973-88). PNA monomers are then coupled in a
stepwise manner to produce a chimeric molecule with a 5' PNA segment and a 3' DNA segment
(Finn et al. (1996) above). Alternatively, chimeric molecules can be synthesized with a 5' DNA
segment and a 3' PNA segment. See, Petersen et al. (1975) Bioorg Med Chem Lett 5:
1119-11124.

In other embodiments, the oligonucleotide may include other appended groups such as
peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the
cell membrane (see, e.g., Letsinger et al., 1989, Proc. Natl. Acad. Sci. U.S.A. 86:6553-6556;
Lemaitre et al., Proc. Natl. Acad. Sci. 84:648-652; PCT Publication No. WO88/09810) or
the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In addition,
oligonucleotides can be modified with hybridization triggered cleavage agents (see, e.g., Krol et
al., 1988, BioTechniques 6:958-976) or intercalating agents (see, e.g., Zon, 1988, Pharm. Res.
5: 539-549). To this end, the oligonucleotide may be conjugated to another molecule, e.g., a
peptide, a hybridization triggered cross-linking agent, a transport agent, a hybridization-triggered
cleavage agent, etc.

4.5 HOSTS 

The present invention further provides host cells genetically engineered to contain the
polynucleotides of the invention. For example, such host cells may contain nucleic acids of the
invention introduced into the host cell using known transformation, transfection or infection
methods. The present invention still further provides host cells genetically engineered to express
the polynucleotides of the invention, wherein such polynucleotides are in operative association
with a regulatory sequence heterologous to the host cell which drives expression of the
polynucleotides in the cell.

Knowledge of nucleic acid sequences allows for modification of cells to permit, or
increase, expression of endogenous polypeptide. Cells can be modified (e.g., by homologous
recombination) to provide increased polypeptide expression by replacing, in whole or in part, the
naturally occurring promoter with all or part of a heterologous promoter so that the cells express
the polypeptide at higher levels. The heterologous promoter is inserted in such a manner that it
is operatively linked to the encoding sequences. See, for example, PCT International Publication
No. WO94/12650, PCT International Publication No. WO92/20808, and PCT International
Publication No. WO91/09955. It is also contemplated that, in addition to heterologous promoter
DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the coding
sequence, amplification of the marker DNA by standard selection methods results in
co-amplification of the desired protein coding sequences in the cells.

The host cell can be a higher eukaryotic host cell, such as a mammalian cell, a lower
eukaryotic host cell, such as a yeast cell, or the host cell can be a prokaryotic cell, such as a
bacterial cell. Introduction of the recombinant construct into the host cell can be effected by
calcium phosphate transfection, DEAE-dextran mediated transfection, or electroporation (Davis,
L. et al., Basic Methods in Molecular Biology (1986)). The host cells containing one of the
polynucleotides of the invention can be used in conventional manners to produce the gene
product encoded by the isolated fragment (in the case of an ORF) or can be used to produce a
heterologous protein under the control of the EMF.

Any host/vector system can be used to express one or more of the ORFs of the present
invention. These include, but are not limited to, eukaryotic hosts such as HeLa cells, CV-1 cells,
COS cells, 293 cells, and Sf9 cells, as well as prokaryotic hosts such as E. coli and B. subtilis.
The most preferred cells are those which do not normally express the particular polypeptide or
protein or which express the polypeptide or protein at a low natural level. Mature proteins can
be expressed in mammalian cells, yeast, bacteria, or other cells under the control of appropriate
promoters. Cell-free translation systems can also be employed to produce such proteins using
RNAs derived from the DNA constructs of the present invention. Appropriate cloning and
expression vectors for use with prokaryotic and eukaryotic hosts are described by Sambrook, et
al., in Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor, New
York (1989), the disclosure of which is hereby incorporated by reference.

Various mammalian cell culture systems can also be employed to express recombinant
protein. Examples of mammalian expression systems include the COS-7 lines of monkey kidney
fibroblasts, described by Gluzman, Cell 23:175 (1981). Other cell lines capable of expressing a
compatible vector are, for example, C127 cells, monkey COS cells, Chinese Hamster Ovary
(CHO) cells, human kidney 293 cells, human epidermal A431 cells, human Colo205 cells, 3T3
cells, CV-1 cells, other transformed primate cell lines, normal diploid cells, cell strains derived
from in vitro culture of primary tissue, primary explants, HeLa cells, mouse L cells, BHK,
HL-60, U937, HaK or Jurkat cells. Mammalian expression vectors will comprise an origin of
replication, a suitable promoter and also any necessary ribosome binding sites, polyadenylation
site, splice donor and acceptor sites, transcriptional termination sequences, and 5' flanking
nontranscribed sequences. DNA sequences derived from the SV40 viral genome, for example,
SV40 origin, early promoter, enhancer, splice, and polyadenylation sites may be used to provide
the required nontranscribed genetic elements. Recombinant polypeptides and proteins produced
in bacterial culture are usually isolated by initial extraction from cell pellets, followed by one or
more salting-out, aqueous ion exchange or size exclusion chromatography steps. Protein
refolding steps can be used, as necessary, in completing configuration of the mature protein.
Finally, high performance liquid chromatography (HPLC) can be employed for final purification
steps. Microbial cells employed in expression of proteins can be disrupted by any convenient
method, including freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing
agents.

Alternatively, it may be possible to produce the protein in lower eukaryotes such as yeast
or insects or in prokaryotes such as bacteria. Potentially suitable yeast strains include
Saccharomyces cerevisiae, Schizosaccharomyces pombe, Kluyveromyces strains, Candida, or
any yeast strain capable of expressing heterologous proteins. Potentially suitable bacterial
strains include Escherichia coli, Bacillus subtilis, Salmonella typhimurium, or any bacterial
strain capable of expressing heterologous proteins. If the protein is made in yeast or bacteria, it
may be necessary to modify the protein produced therein, for example by phosphorylation or
glycosylation of the appropriate sites, in order to obtain the functional protein. Such covalent
attachments may be accomplished using known chemical or enzymatic methods.

In another embodiment of the present invention, cells and tissues may be engineered to
express an endogenous gene comprising the polynucleotides of the invention under the control of
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene
may be replaced by homologous recombination. As described herein, gene targeting can be used
to replace a gene's existing regulatory region with a regulatory sequence isolated from a
different gene or a novel regulatory sequence synthesized by genetic engineering methods. Such
regulatory sequences may be comprised of promoters, enhancers, scaffold-attachment regions,
negative regulatory elements, transcriptional initiation sites, regulatory protein binding sites or
combinations of said sequences. Alternatively, sequences which affect the structure or stability
of the RNA or protein produced may be replaced, removed, added, or otherwise modified by
targeting. These sequences include polyadenylation signals, mRNA stability elements, splice
sites, leader sequences for enhancing or modifying transport or secretion properties of the
protein, or other sequences which alter or improve the function or stability of protein or RNA
molecules.

The targeting event may be a simple insertion of the regulatory sequence, placing the
gene under the control of the new regulatory sequence, e.g., inserting a new promoter or
enhancer or both upstream of a gene. Alternatively, the targeting event may be a simple deletion
of a regulatory element, such as the deletion of a tissue-specific negative regulatory element.
Alternatively, the targeting event may replace an existing element; for example, a tissue-specific
enhancer can be replaced by an enhancer that has broader or different cell-type specificity than
the naturally occurring elements. Here, the naturally occurring sequences are deleted and new
sequences are added. In all cases, the identification of the targeting event may be facilitated by
the use of one or more selectable marker genes that are contiguous with the targeting DNA,
allowing for the selection of cells in which the exogenous DNA has integrated into the host cell
genome. The identification of the targeting event may also be facilitated by the use of one or
more marker genes exhibiting the property of negative selection, such that the negatively
selectable marker is linked to the exogenous DNA, but configured such that the negatively
selectable marker flanks the targeting sequence, and such that a correct homologous
recombination event with sequences in the host cell genome does not result in the stable
integration of the negatively selectable marker. Markers useful for this purpose include the
Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial xanthine-guanine
phosphoribosyl-transferase (gpt) gene.

The gene targeting or gene activation techniques which can be used in accordance with
this aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to
Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No.
PCT/US92/09627 (WO93/09222) by Selden et al.; and International Application No.
PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is incorporated by reference
herein in its entirety.

4.6 POLYPEPTIDES OF THE INVENTION

The isolated polypeptides of the invention include, but are not limited to, a polypeptide
comprising: the amino acid sequences set forth as any one of SEQ ID NO:2 or 3 or an amino
acid sequence encoded by SEQ ID NO:1 or the corresponding full length or mature protein.
Polypeptides of the invention also include polypeptides preferably with biological or
immunological activity that are encoded by: (a) a polynucleotide having the nucleotide sequence
set forth in SEQ ID NO:1; or (b) polynucleotides encoding any one of the amino acid sequences
set forth as SEQ ID NO:2 or 3; or (c) polynucleotides that hybridize to the complement of the
polynucleotides of either (a) or (b) under stringent hybridization conditions. The invention also
provides biologically active or immunologically active variants of any of the amino acid
sequences set forth as SEQ ID NO:2 or 3 or the corresponding full length or mature protein; and
"substantial equivalents" thereof (e.g., with at least about 65%, at least about 70%, at least about
75%, at least about 80%, at least about 85%, at least about 90%, typically at least about 95%,
more typically at least about 98%, or most typically at least about 99% amino acid identity) that
retain biological activity. Polypeptides encoded by allelic variants may have a similar,
increased, or decreased activity compared to polypeptides comprising SEQ ID NO:2 or 3.

Fragments of the proteins of the present invention which are capable of exhibiting
biological activity are also encompassed by the present invention. Fragments of the protein may
be in linear form or they may be cyclized using known methods, for example, as described in H.
U. Saragovi, et al., Bio/Technology 10, 773-778 (1992) and in R. S. McDowell, et al., J. Amer.
Chem. Soc. 114, 9245-9253 (1992), both of which are incorporated herein by reference. Such
fragments may be fused to carrier molecules such as immunoglobulins for many purposes,
including increasing the valency of protein binding sites.
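
The amino acid identity thresholds recited above for "substantial equivalents" can be illustrated with a short calculation. The sketch below is only one possible convention, assumed here for illustration: the two sequences are taken to be already aligned to equal length (gaps written as "-"), and identity is the number of identical non-gap positions divided by the alignment length. The function names and example peptides are hypothetical and are not SEQ ID NO:2 or 3.

```python
# Identity thresholds listed in the text, in percent.
THRESHOLDS = (65, 70, 75, 80, 85, 90, 95, 98, 99)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Gap characters ("-") never count as matches. The denominator is the
    full alignment length, which is an assumption of this sketch.
    """
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("sequences must be non-empty and aligned to equal length")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


def highest_threshold_met(identity: float):
    """Return the largest listed threshold satisfied, or None if below 65%."""
    met = [t for t in THRESHOLDS if identity >= t]
    return max(met) if met else None


# Hypothetical 10-residue peptides differing at one position.
reference = "MKTAYIAKQR"
variant = "MKTAYIAKQK"
identity = percent_identity(reference, variant)
print(identity, highest_threshold_met(identity))
```

Other conventions (for example, excluding gap columns from the denominator, or first computing an optimal alignment) give different values, so the convention should be stated whenever an identity figure is reported.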

The present invention also provides both full-length and mature forms (for example,
without a signal sequence or precursor sequence) of the disclosed proteins. The protein coding
sequence is identified in the sequence listing by translation of the disclosed nucleotide
sequences. The mature form of such protein may be obtained by expression of a full-length
polynucleotide in a suitable mammalian cell or other host cell. The sequence of the mature form
of the protein is also determinable from the amino acid sequence of the full-length form. Where
proteins of the present invention are membrane bound, soluble forms of the proteins are also
provided. In such forms, part or all of the regions causing the proteins to be membrane bound
are deleted so that the proteins are fully secreted from the cell in which they are expressed.

Protein compositions of the present invention may further comprise an acceptable carrier,
such as a hydrophilic, e.g., pharmaceutically acceptable, carrier.

The present invention further provides isolated polypeptides encoded by the nucleic acid
fragments of the present invention or by degenerate variants of the nucleic acid fragments of the
present invention. By "degenerate variant" is intended nucleotide fragments which differ from a
nucleic acid fragment of the present invention (e.g., an ORF) by nucleotide sequence but, due to
the degeneracy of the genetic code, encode an identical polypeptide sequence. Preferred nucleic
acid fragments of the present invention are the ORFs that encode proteins.
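
The definition of a degenerate variant can be checked mechanically: two ORFs that differ in nucleotide sequence are degenerate variants of one another if they translate to the same polypeptide. The following minimal sketch assumes the standard genetic code and in-frame ORFs; the function names and the short example ORFs are hypothetical and serve only as an illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ("*" marks a stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(orf: str) -> str:
    """Translate an in-frame DNA ORF, stopping at the first stop codon."""
    orf = orf.upper()
    protein = []
    for i in range(0, len(orf) - len(orf) % 3, 3):
        amino_acid = CODON_TABLE[orf[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def is_degenerate_variant(orf_a: str, orf_b: str) -> bool:
    """True if the ORFs differ in sequence but encode the same polypeptide."""
    return orf_a.upper() != orf_b.upper() and translate(orf_a) == translate(orf_b)


# Hypothetical ORFs: the second uses synonymous codons for Ala, Ser and Lys.
original = "ATGGCTAGCAAGTAA"
synonymous = "ATGGCCTCTAAATGA"
print(translate(original), is_degenerate_variant(original, synonymous))
```

A substitution that changes an amino acid (for example, AAG to AAT, Lys to Asn) makes the function return False, since the encoded polypeptide is no longer identical.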

A variety of methodologies known in the art can be utilized to obtain any one of the
isolated polypeptides or proteins of the present invention. At the simplest level, the amino acid
sequence can be synthesized using commercially available peptide synthesizers. The
synthetically-constructed protein sequences, by virtue of sharing primary, secondary or tertiary
structural and/or conformational characteristics with proteins may possess biological properties
in common therewith, including protein activity. This technique is particularly useful in
producing small peptides and fragments of larger polypeptides. Fragments are useful, for
example, in generating antibodies against the native polypeptide. Thus, they may be employed
as biologically active or immunological substitutes for natural, purified proteins in screening of
therapeutic compounds and in immunological processes for the development of antibodies.

The polypeptides and proteins of the present invention can alternatively be purified from
cells which have been altered to express the desired polypeptide or protein. As used herein, a
cell is said to be altered to express a desired polypeptide or protein when the cell, through
genetic manipulation, is made to produce a polypeptide or protein which it normally does not
produce or which the cell normally produces at a lower level. One skilled in the art can readily
adapt procedures for introducing and expressing either recombinant or synthetic sequences into
eukaryotic or prokaryotic cells in order to generate a cell which produces one of the polypeptides
or proteins of the present invention.

The invention also relates to methods for producing a polypeptide comprising growing a 
culture of host cells of the invention in a suitable culture medium, and purifying the protein from 
the cells or the culture in which the cells are grown. For example, the methods of the invention 
include a process for producing a polypeptide in which a host cell containing a suitable 
expression vector that includes a polynucleotide of the invention is cultured under conditions that 
allow expression of the encoded polypeptide. The polypeptide can be recovered from the 
culture, conveniently from the culture medium, or from a lysate prepared from the host cells and 
further purified. Preferred embodiments include those in which the protein produced by such 
process is a full length or mature form of the protein. 

In an alternative method, the polypeptide or protein is purified from bacterial cells which
naturally produce the polypeptide or protein. One skilled in the art can readily follow known
methods for isolating polypeptides and proteins in order to obtain one of the isolated
polypeptides or proteins of the present invention. These include, but are not limited to,
immunochromatography, HPLC, size-exclusion chromatography, ion-exchange chromatography,
and immuno-affinity chromatography. See, e.g., Scopes, Protein Purification: Principles and
Practice, Springer-Verlag (1994); Sambrook, et al., in Molecular Cloning: A Laboratory
Manual; Ausubel et al., Current Protocols in Molecular Biology. Polypeptide fragments that
retain biological/immunological activity include fragments comprising greater than about 100
amino acids, or greater than about 200 amino acids, and fragments that encode specific protein
domains.

The purified polypeptides can be used in in vitro binding assays which are well known in
the art to identify molecules which bind to the polypeptides. These molecules include, but are not
limited to, e.g., small molecules, molecules from combinatorial libraries, antibodies or other
proteins. The molecules identified in the binding assay are then tested for antagonist or agonist
activity in in vivo tissue culture or animal models that are well known in the art. In brief, the
molecules are titrated into a plurality of cell cultures or animals and then tested for either
cell/animal death or prolonged survival of the animal/cells.

In addition, the peptides of the invention or molecules capable of binding to the peptides 
may be complexed with toxins, e.g., ricin or cholera, or with other compounds that are toxic to
cells. The toxin-binding molecule complex is then targeted to a tumor or other cell by the
specificity of the binding molecule for SEQ ID NO:2 or 3.

The protein of the invention may also be expressed as a product of transgenic animals, 
e.g., as a component of the milk of transgenic cows, goats, pigs, or sheep which are characterized 
by somatic or germ cells containing a nucleotide sequence encoding the protein.

The proteins provided herein also include proteins characterized by amino acid sequences
similar to those of purified proteins but into which modifications are naturally provided or
deliberately engineered. For example, modifications, in the peptide or DNA sequence, can be
made by those skilled in the art using known techniques. Modifications of interest in the protein
sequences may include the alteration, substitution, replacement, insertion or deletion of a
selected amino acid residue in the coding sequence. For example, one or more of the cysteine
residues may be deleted or replaced with another amino acid to alter the conformation of the
molecule. Techniques for such alteration, substitution, replacement, insertion or deletion are
well known to those skilled in the art (see, e.g., U.S. Patent No. 4,518,584). Preferably, such
alteration, substitution, replacement, insertion or deletion retains the desired activity of the
protein. Regions of the protein that are important for the protein function can be determined by
various methods known in the art including the alanine-scanning method, which involves
systematic substitution of single or strings of amino acids with alanine, followed by testing the
resulting alanine-containing variant for biological activity. This type of analysis determines the
importance of the substituted amino acid(s) in biological activity. Regions of the protein that are
important for protein function may be determined by the eMATRIX program.
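
The alanine-scanning method described above can be sketched as a simple enumeration of variants. The example below is a minimal illustration only: the function name, the `window` parameter (1 for single residues, larger values for strings of residues), and the example peptide are hypothetical, and the activity testing of each variant, which is the experimental part of the method, is not modeled.

```python
def alanine_scan(protein: str, window: int = 1):
    """Enumerate alanine-substituted variants of a protein sequence.

    Each variant replaces `window` consecutive residues with alanine.
    Windows that already consist only of alanine are skipped, since the
    substitution would leave the sequence unchanged.

    Returns a list of (1-based start position, replaced residues, variant).
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    variants = []
    for i in range(len(protein) - window + 1):
        segment = protein[i:i + window]
        if set(segment) == {"A"}:
            continue
        variant = protein[:i] + "A" * window + protein[i + window:]
        variants.append((i + 1, segment, variant))
    return variants


# Hypothetical four-residue peptide used only to exercise the function.
for position, replaced, variant in alanine_scan("MKCA"):
    print(position, replaced, variant)
```

Each variant would then be expressed and assayed; a loss of activity for a given variant suggests that the substituted residue or residues contribute to function.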

Other fragments and derivatives of the sequences of proteins which would be expected to
retain protein activity in whole or in part and are useful for screening or other immunological
methodologies may also be easily made by those skilled in the art given the disclosures herein.
Such modifications are encompassed by the present invention.

The protein may also be produced by operably linking the isolated polynucleotide of the
invention to suitable control sequences in one or more insect expression vectors, and employing
an insect expression system. Materials and methods for baculovirus/insect cell expression
systems are commercially available in kit form from, e.g., Invitrogen, San Diego, Calif., U.S.A.
(the MaxBac™ kit), and such methods are well known in the art, as described in Summers and
Smith, Texas Agricultural Experiment Station Bulletin No. 1555 (1987), incorporated herein by
reference. As used herein, an insect cell capable of expressing a polynucleotide of the present
invention is "transformed."

The protein of the invention may be prepared by culturing transformed host cells under
culture conditions suitable to express the recombinant protein. The resulting expressed protein
may then be purified from such culture (i.e., from culture medium or cell extracts) using known
purification processes, such as gel filtration and ion exchange chromatography. The purification
of the protein may also include an affinity column containing agents which will bind to the
protein; one or more column steps over such affinity resins as concanavalin A-agarose,
heparin-toyopearl™ or Cibacrom blue 3GA Sepharose™; one or more steps involving
hydrophobic interaction chromatography using such resins as phenyl ether, butyl ether, or propyl
ether; or immunoaffinity chromatography.

Alternatively, the protein of the invention may also be expressed in a form which will 
facilitate purification. For example, it may be expressed as a fusion protein, such as those of
maltose binding protein (MBP), glutathione-S-transferase (GST) or thioredoxin (TRX), or with a
His tag. Kits for expression and purification of such fusion proteins are commercially available
from New England BioLabs (Beverly, Mass.), Pharmacia (Piscataway, N.J.) and Invitrogen,
respectively. The protein can also be tagged with an epitope and subsequently purified by using
a specific antibody directed to such epitope. One such epitope ("FLAG®") is commercially
available from Kodak (New Haven, Conn.).

Finally, one or more reverse-phase high performance liquid chromatography (RP-HPLC)
steps employing hydrophobic RP-HPLC media, e.g., silica gel having pendant methyl or other
aliphatic groups, can be employed to further purify the protein. Some or all of the foregoing
purification steps, in various combinations, can also be employed to provide a substantially
homogeneous isolated recombinant protein. The protein thus purified is substantially free of
other mammalian proteins and is defined in accordance with the present invention as an "isolated
protein."

The polypeptides of the invention include analogs (variants). This embraces fragments, 
as well as peptides in which one or more amino acids has been deleted, inserted, or substituted.
Also, analogs of the polypeptides of the invention embrace fusions of the polypeptides or
modifications of the polypeptides of the invention, wherein the polypeptide or analog is fused to
another moiety or moieties, e.g., targeting moiety or another therapeutic agent. Such analogs
may exhibit improved properties such as activity and/or stability. Examples of moieties which
may be fused to the polypeptide or an analog include, for example, targeting moieties which
provide for the delivery of polypeptide to pancreatic cells, e.g., antibodies to pancreatic cells,
antibodies to immune cells such as T-cells, monocytes, dendritic cells, granulocytes, etc., as well
as receptors and ligands expressed on pancreatic or immune cells. Other moieties which may be
fused to the polypeptide include therapeutic agents which are used for treatment, for example,
immunosuppressive drugs such as cyclosporin, FK506, azathioprine, CD3 antibodies and
steroids. Also, polypeptides may be fused to immune modulators, and other cytokines such as
alpha or beta interferon.

4.6.1 DETERMINING POLYPEPTIDE AND POLYNUCLEOTIDE IDENTITY AND SIMILARITY

Preferred methods to determine identity and/or similarity are designed to give the largest match
between the sequences tested. Methods to determine identity and similarity are codified in computer
programs including, but not limited to, the GCG program package, including GAP
(Devereux, J., et al., Nucleic Acids Research 12(1):387 (1984); Genetics Computer Group,
University of Wisconsin, Madison, WI), BLASTP, BLASTN, BLASTX, FASTA (Altschul, S.F.
et al., J. Molec. Biol. 215:403-410 (1990)), PSI-BLAST (Altschul, S.F. et al., Nucleic Acids Res.
Vol. 25, pp. 3389-3402, herein incorporated by reference), eMatrix software (Wu et al., J. Comp.
Biol., Vol. 6, pp. 219-235 (1999), herein incorporated by reference), eMotif software
(Nevill-Manning et al., ISMB-97, Vol. 4, pp. 202-209, herein incorporated by reference), pFam
software (Sonnhammer et al., Nucleic Acids Res., Vol. 26(1), pp. 320-322 (1998), herein
incorporated by reference) and the Kyte-Doolittle hydrophobicity prediction algorithm (J. Mol.
Biol., 157, pp. 105-31 (1982), incorporated herein by reference). The BLAST programs are
publicly available from the National Center for Biotechnology Information (NCBI) and other
sources (BLAST Manual, Altschul, S., et al., NCB NLM NIH Bethesda, MD 20894; Altschul, S.,
et al., J. Mol. Biol. 215:403-410 (1990)).
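
The idea of choosing the alignment that gives the largest match and then reporting identity can be sketched as follows. This is a minimal global alignment of the Needleman-Wunsch type with a linear gap penalty; it is not the GAP, BLAST or FASTA implementation, and the scoring values (`match`, `mismatch`, `gap`) and function names are assumptions chosen only for illustration.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Return one optimal global alignment of a and b as two gapped strings."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover the alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a, aligned_b):
    """Identical aligned positions as a percentage of alignment length."""
    if not aligned_a:
        return 0.0
    same = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * same / len(aligned_a)


if __name__ == "__main__":
    aln = global_align("ACGT", "AGT")
    print(aln, percent_identity(*aln))
```

Note that the denominator used here is the full alignment length including gap columns; other conventions (for example, the length of the shorter sequence) are also in use and give different percentages.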

4.7 CHIMERIC AND FUSION PROTEINS

The invention also provides chimeric or fusion proteins. As used herein, a "chimeric
protein" or "fusion protein" comprises a polypeptide of the invention operatively linked to
another polypeptide. Within a fusion protein the polypeptide according to the invention can
correspond to all or a portion of a protein according to the invention. In one embodiment, a
fusion protein comprises at least one biologically active portion of a protein according to the
invention. In another embodiment, a fusion protein comprises at least two biologically active
portions of a protein according to the invention. Within the fusion protein, the term "operatively
linked" is intended to indicate that the polypeptide according to the invention and the other
polypeptide are fused in-frame to each other. The polypeptide can be fused to the N-terminus or
C-terminus.

For example, in one embodiment a fusion protein comprises a polypeptide according to
the invention operably linked to the extracellular domain of a second protein.

In another embodiment, the fusion protein is a GST-fusion protein in which the polypeptide
sequences of the invention are fused to the C-terminus of the GST (i.e., glutathione
S-transferase) sequences.

In another embodiment, the fusion protein is an immunoglobulin fusion protein in which
the polypeptide sequences according to the invention, comprising one or more domains, are fused
to sequences derived from a member of the immunoglobulin protein family. The
immunoglobulin fusion proteins of the invention can be incorporated into pharmaceutical
compositions and administered to a subject to inhibit an interaction between a ligand and a
protein of the invention on the surface of a cell, to thereby suppress signal transduction in vivo.
The immunoglobulin fusion proteins can be used to affect the bioavailability of a cognate ligand.
Inhibition of the ligand/protein interaction may be useful therapeutically both for the treatment of
proliferative and differentiative disorders, e.g., cancer, and for modulating (e.g., promoting or
inhibiting) cell survival. Moreover, the immunoglobulin fusion proteins of the invention can be
used as immunogens to produce antibodies in a subject, to purify ligands, and in screening assays
to identify molecules that inhibit the interaction of a polypeptide of the invention with a ligand.

A chimeric or fusion protein of the invention can be produced by standard recombinant 
DNA techniques. For example, DNA fragments coding for the different polypeptide sequences 
are ligated together in-frame in accordance with conventional techniques, e.g., by employing
blunt-ended or stagger-ended termini for ligation, restriction enzyme digestion to provide for
appropriate termini, filling-in of cohesive ends as appropriate, alkaline phosphatase treatment to
avoid undesirable joining, and enzymatic ligation. In another embodiment, the fusion gene can
be synthesized by conventional techniques including automated DNA synthesizers.
Alternatively, PCR amplification of gene fragments can be carried out using anchor primers that
give rise to complementary overhangs between two consecutive gene fragments that can
subsequently be annealed and reamplified to generate a chimeric gene sequence (see, for
example, Ausubel et al. (eds.) Current Protocols in Molecular Biology, John Wiley &
Sons, 1992). Moreover, many expression vectors are commercially available that already encode
a fusion moiety (e.g., a GST polypeptide). A nucleic acid encoding a polypeptide of the
invention can be cloned into such an expression vector such that the fusion moiety is linked
in-frame to the protein of the invention.
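
The requirement that two coding sequences be joined in-frame can be checked computationally before any cloning is attempted. The following is a minimal sketch under stated assumptions: the function name `fuse_in_frame`, the optional `linker` argument and the short example sequences are hypothetical, and the check covers only reading-frame length and premature stop codons in the upstream partner.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fuse_in_frame(upstream_cds, downstream_cds, linker=""):
    """Join two coding sequences so the downstream partner stays in frame.

    The upstream partner's terminal stop codon, if present, is removed so that
    translation reads through into the linker and the downstream partner.
    Raises ValueError if any part would shift the reading frame or if the
    upstream partner contains an internal stop codon.
    """
    parts = {"upstream": upstream_cds.upper(),
             "linker": linker.upper(),
             "downstream": downstream_cds.upper()}
    for name, seq in parts.items():
        if len(seq) % 3 != 0:
            raise ValueError(f"{name} length {len(seq)} is not a multiple of 3")
    up = parts["upstream"]
    codons = [up[i:i + 3] for i in range(0, len(up), 3)]
    if codons and codons[-1] in STOP_CODONS:
        codons = codons[:-1]  # drop the terminal stop to allow read-through
    if any(codon in STOP_CODONS for codon in codons):
        raise ValueError("upstream sequence contains an internal stop codon")
    return "".join(codons) + parts["linker"] + parts["downstream"]


if __name__ == "__main__":
    # Hypothetical toy sequences: fusion partner first, polypeptide second.
    print(fuse_in_frame("ATGGCCTAA", "ATGAAATGA", linker="GGATCC"))
```

A linker whose length is not a multiple of three, such as a single stray base left by a restriction site, is rejected because it would place the downstream polypeptide out of frame.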

4.8 GENE THERAPY 

Mutations in the polynucleotides of the invention may result in loss of normal
function of the encoded protein. The invention thus provides gene therapy to restore normal
activity of the polypeptides of the invention, or to treat disease states involving polypeptides of
the invention. Delivery of a functional gene encoding polypeptides of the invention to
appropriate cells is effected ex vivo, in situ, or in vivo by use of vectors, and more particularly
viral vectors (e.g., adenovirus, adeno-associated virus, or a retrovirus), or ex vivo by use of
physical DNA transfer methods (e.g., liposomes or chemical treatments). See, for example,
Anderson, Nature, supplement to vol. 392, no. 6679, pp. 25-30 (1998). For additional reviews of
gene therapy technology see Friedmann, Science, 244: 1275-1281 (1989); Verma, Scientific
American: 68-84 (1990); and Miller, Nature, 357: 455-460 (1992). Introduction of any one of
the nucleotides of the present invention or a gene encoding the polypeptides of the present
invention can also be accomplished with extrachromosomal substrates (transient expression) or
artificial chromosomes (stable expression). Cells may also be cultured ex vivo in the presence of
proteins of the present invention in order to proliferate or to produce a desired effect on or
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes.
Alternatively, it is contemplated that in other human disease states, preventing the expression of
or inhibiting the activity of polypeptides of the invention will be useful in treating the disease
states. It is contemplated that antisense therapy or gene therapy could be applied to negatively
regulate the expression of polypeptides of the invention.

Other methods of inhibiting expression of a protein include the introduction of antisense
molecules to the nucleic acids of the present invention, their complements, or their translated RNA
sequences, by methods known in the art. Further, the polypeptides of the present invention can be
inhibited by using targeted deletion methods, or the insertion of a negative regulatory element such
as a silencer, which is tissue specific.
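
An antisense molecule is complementary to its target and reads in the opposite direction, so its sequence is the reverse complement of the target region. The following minimal sketch derives a DNA antisense sequence from a sense-strand DNA or mRNA sequence; the function name `antisense_dna` and the example sequence are hypothetical, and the sketch makes no attempt to select an effective target site.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_dna(sense):
    """Return the DNA reverse complement of a sense DNA or mRNA sequence (5'->3')."""
    seq = sense.upper().replace("U", "T")  # treat RNA input as its DNA equivalent
    unexpected = set(seq) - set("ACGT")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(antisense_dna("AUGGCC"))
```

For the input "AUGGCC" the complement of the DNA equivalent "ATGGCC" is "TACCGG", which read in the 5' to 3' direction is "GGCCAT".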

The present invention still further provides cells genetically engineered in vivo to express
the polynucleotides of the invention, wherein such polynucleotides are in operative association with
a regulatory sequence heterologous to the host cell which drives expression of the polynucleotides
in the cell. These methods can be used to increase or decrease the expression of the polynucleotides
of the present invention.

Knowledge of DNA sequences provided by the invention allows for modification of cells to
permit, increase, or decrease expression of endogenous polypeptide. Cells can be modified (e.g., by
homologous recombination) to provide increased polypeptide expression by replacing, in whole or
in part, the naturally occurring promoter with all or part of a heterologous promoter so that the cells
express the protein at higher levels. The heterologous promoter is inserted in such a manner that it is
operatively linked to the desired protein encoding sequences. See, for example, PCT International
Publication No. WO 94/12650, PCT International Publication No. WO 92/20808, and PCT
International Publication No. WO 91/09955. It is also contemplated that, in addition to heterologous
promoter DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the desired
protein coding sequence, amplification of the marker DNA by standard selection methods results in
co-amplification of the desired protein coding sequences in the cells.

In another embodiment of the present invention, cells and tissues may be engineered to
express an endogenous gene comprising the polynucleotides of the invention under the control of
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene may
be replaced by homologous recombination. As described herein, gene targeting can be used to
replace a gene's existing regulatory region with a regulatory sequence isolated from a different gene
or a novel regulatory sequence synthesized by genetic engineering methods. Such regulatory
sequences may be comprised of promoters, enhancers, scaffold-attachment regions, negative
regulatory elements, transcriptional initiation sites, regulatory protein binding sites or combinations
of said sequences. Alternatively, sequences which affect the structure or stability of the RNA or
protein produced may be replaced, removed, added, or otherwise modified by targeting. These
sequences include polyadenylation signals, mRNA stability elements, splice sites, leader sequences
for enhancing or modifying transport or secretion properties of the protein, or other sequences
which alter or improve the function or stability of protein or RNA molecules.

The targeting event may be a simple insertion of the regulatory sequence, placing the gene
under the control of the new regulatory sequence, e.g., inserting a new promoter or enhancer or both
upstream of a gene. Alternatively, the targeting event may be a simple deletion of a regulatory
element, such as the deletion of a tissue-specific negative regulatory element. Alternatively, the
targeting event may replace an existing element; for example, a tissue-specific enhancer can be
replaced by an enhancer that has broader or different cell-type specificity than the naturally
occurring elements. Here, the naturally occurring sequences are deleted and new sequences are
added. In all cases, the identification of the targeting event may be facilitated by the use of one or
more selectable marker genes that are contiguous with the targeting DNA, allowing for the selection
of cells in which the exogenous DNA has integrated into the cell genome. The identification of the
targeting event may also be facilitated by the use of one or more marker genes exhibiting the
property of negative selection, such that the negatively selectable marker is linked to the exogenous
DNA, but configured such that the negatively selectable marker flanks the targeting sequence, and
such that a correct homologous recombination event with sequences in the host cell genome does
not result in the stable integration of the negatively selectable marker. Markers useful for this
purpose include the Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial
xanthine-guanine phosphoribosyl-transferase (gpt) gene.

The gene targeting or gene activation techniques which can be used in accordance with this
aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to Chappel;
U.S. Patent No. 5,578,461 to Sherman et al.; International Application No. PCT/US92/09627
(WO93/09222) by Selden et al.; and International Application No. PCT/US90/06436
(WO91/06667) by Skoultchi et al., each of which is incorporated by reference herein in its entirety.

4.9 TRANSGENIC ANIMALS 

In preferred methods to determine biological functions of the polypeptides of the
invention in vivo, one or more genes provided by the invention are either overexpressed or
inactivated in the germ line of animals using homologous recombination [Capecchi, Science
244:1288-1292 (1989)]. Animals in which the gene is overexpressed, under the regulatory
control of exogenous or endogenous promoter elements, are known as transgenic animals.
Animals in which an endogenous gene has been inactivated by homologous recombination are
referred to as "knockout" animals. Knockout animals, preferably non-human mammals, can be
prepared as described in U.S. Patent No. 5,557,032, incorporated herein by reference. Transgenic
animals are useful to determine the roles polypeptides of the invention play in biological
processes, and preferably in disease states. Transgenic animals are useful as model systems to
identify compounds that modulate lipid metabolism. Transgenic animals, preferably non-human
mammals, are produced using methods as described in U.S. Patent No. 5,489,743 and PCT
Publication No. WO94/28122, incorporated herein by reference.

Transgenic animals can be prepared wherein all or part of a promoter of the 
polynucleotides of the invention is either activated or inactivated to alter the level of expression 
of the polypeptides of the invention. Inactivation can be carried out using homologous 
recombination methods described above. Activation can be achieved by supplementing or even
replacing the homologous promoter to provide for increased protein expression. The homologous
promoter can be supplemented by insertion of one or more heterologous enhancer elements 
known to confer promoter activation in a particular tissue. 

The polynucleotides of the present invention also make possible the development,
through, e.g., homologous recombination or knockout strategies, of animals that fail to express
polypeptides of the invention or that express a variant polypeptide. Such animals are useful as
models for studying the in vivo activities of the polypeptides as well as for studying modulators of
the polypeptides of the invention.


4.10 USES AND BIOLOGICAL ACTIVITY 

The polynucleotides and proteins of the present invention are expected to exhibit one or
more of the uses or biological activities (including those associated with assays cited herein)
identified herein. Uses or activities described for proteins of the present invention may be
provided by administration or use of such proteins or of polynucleotides encoding such proteins
(such as, for example, in gene therapies or vectors suitable for introduction of DNA). The
mechanism underlying the particular condition or pathology will dictate whether the
polypeptides of the invention, the polynucleotides of the invention or modulators (activators or
inhibitors) thereof would be beneficial to the subject in need of treatment. Thus, "therapeutic
compositions of the invention" include compositions comprising isolated polynucleotides
(including recombinant DNA molecules, cloned genes and degenerate variants thereof) or
polypeptides of the invention (including full length protein, mature protein and truncations or
domains thereof), or compounds and other substances that modulate the overall activity of the
target gene products, either at the level of target gene/protein expression or target protein
activity. Such modulators include polypeptides, analogs (variants), including fragments and
fusion proteins, antibodies and other binding proteins; chemical compounds that directly or
indirectly activate or inhibit the polypeptides of the invention (identified, e.g., via drug screening
assays as described herein); antisense polynucleotides and polynucleotides suitable for triple
helix formation; and in particular antibodies or other binding partners that specifically recognize
one or more epitopes of the polypeptides of the invention.

The polypeptides of the present invention may likewise be involved in cellular activation
or in one of the other physiological pathways described herein.

4.10.1 RESEARCH USES AND UTILITIES 

The polynucleotides provided by the present invention can be used by the research
community for various purposes. The polynucleotides can be used to express recombinant
protein for analysis, characterization or therapeutic use; as markers for tissues in which the
corresponding protein is preferentially expressed (either constitutively or at a particular stage of
tissue differentiation or development or in disease states); as molecular weight markers on gels;
as chromosome markers or tags (when labeled) to identify chromosomes or to map related gene
positions; to compare with endogenous DNA sequences in patients to identify potential genetic
disorders; as probes to hybridize and thus discover novel, related DNA sequences; as a source of
information to derive PCR primers for genetic fingerprinting; as a probe to "subtract-out" known
sequences in the process of discovering other novel polynucleotides; for selecting and making
oligomers for attachment to a "gene chip" or other support, including for examination of
expression patterns; to raise anti-protein antibodies using DNA immunization techniques; and as
an antigen to raise anti-DNA antibodies or elicit another immune response. Where the
polynucleotide encodes a protein which binds or potentially binds to another protein (such as, for
example, in a receptor-ligand interaction), the polynucleotide can also be used in interaction trap
assays (such as, for example, that described in Gyuris et al., Cell 75:791-803 (1993)) to identify
polynucleotides encoding the other protein with which binding occurs or to identify inhibitors of
the binding interaction.

The polypeptides provided by the present invention can similarly be used in assays to
determine biological activity, including in a panel of multiple proteins for high-throughput
screening; to raise antibodies or to elicit another immune response; as a reagent (including the
labeled reagent) in assays designed to quantitatively determine levels of the protein (or its
receptor) in biological fluids; as markers for tissues in which the corresponding polypeptide is
preferentially expressed (either constitutively or at a particular stage of tissue differentiation or
development or in a disease state); and, of course, to isolate correlative receptors or ligands.
Proteins involved in these binding interactions can also be used to screen for peptide or small
molecule inhibitors or agonists of the binding interaction.

Any or all of these research utilities are capable of being developed into reagent grade or
kit format for commercialization as research products.

Methods for performing the uses listed above are well known to those skilled in the art. 
References disclosing such methods include without limitation "Molecular Cloning: A
Laboratory Manual", 2d ed., Cold Spring Harbor Laboratory Press, Sambrook, J., E. F. Fritsch
and T. Maniatis eds., 1989, and "Methods in Enzymology: Guide to Molecular Cloning
Techniques", Academic Press, Berger, S. L. and A. R. Kimmel eds., 1987.

4.10.2 NUTRITIONAL USES

Polynucleotides and polypeptides of the present invention can also be used as nutritional
sources or supplements. Such uses include without limitation use as a protein or amino acid
supplement, use as a carbon source, use as a nitrogen source and use as a source of carbohydrate. In
such cases the polypeptide or polynucleotide of the invention can be added to the feed of a
particular organism or can be administered as a separate solid or liquid preparation, such as in the
form of powder, pills, solutions, suspensions or capsules. In the case of microorganisms, the
polypeptide or polynucleotide of the invention can be added to the medium in or on which the
microorganism is cultured.

4.10.3 CYTOKINE AND CELL PROLIFERATION/DIFFERENTIATION ACTIVITY

A polypeptide of the present invention may exhibit activity relating to cytokine, cell
proliferation (either inducing or inhibiting) or cell differentiation (either inducing or inhibiting)
activity or may induce production of other cytokines in certain cell populations. A
polynucleotide of the invention can encode a polypeptide exhibiting such attributes. Many
protein factors discovered to date, including all known cytokines, have exhibited activity in one
or more factor-dependent cell proliferation assays, and hence the assays serve as a convenient
confirmation of cytokine activity. The activity of therapeutic compositions of the present
invention is evidenced by any one of a number of routine factor dependent cell proliferation
assays for cell lines including, without limitation, 32D, DA2, DA1G, T10, B9, B9/11, BaF3,
MC9/G, M+(preB M+), 2E8, RB5, DA1, 123, T1165, HT2, CTLL2, TF-1, Mo7e, CMK,
HUVEC, and Caco. Therapeutic compositions of the invention can be used in the following:

Assays for T-cell or thymocyte proliferation include without limitation those described
in: Current Protocols in Immunology, Ed by J. E. Coligan et al., Pub. Greene Publishing
Associates and Wiley-Interscience (Chapter 3, In vitro assays for Mouse Lymphocyte Function
3.1-3.19; Chapter 7, Immunologic studies in Humans); Takai et al., J. Immunol. 137:3494-3500,
1986; Bertagnolli et al., J. Immunol. 145:1706-1712, 1990; Bertagnolli et al., Cellular
Immunology 133:327-341, 1991; Bertagnolli et al., J. Immunol. 149:3778-3783, 1992; Bowman
et al., J. Immunol. 152:1756-1761, 1994.

Assays for cytokine production and/or proliferation of spleen cells, lymph node cells or
thymocytes include, without limitation, those described in: Polyclonal T cell stimulation,
Kruisbeek, A. M. and Shevach, E. M. In Current Protocols in Immunology. J. E. e.a.
Coligan eds. Vol 1 pp. 3.12.1-3.12.14, John Wiley and Sons, Toronto. 1994; and Measurement
of mouse and human interleukin-γ, Schreiber, R. D. In Current Protocols in Immunology. J.
E. e.a. Coligan eds. Vol 1 pp. 6.8.1-6.8.8, John Wiley and Sons, Toronto. 1994.

Assays for proliferation and differentiation of hematopoietic and lymphopoietic cells
include, without limitation, those described in: Measurement of Human and Murine Interleukin 2
and Interleukin 4, Bottomly, K., Davis, L. S. and Lipsky, P. E. In Current Protocols in
Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 6.3.1-6.3.12, John Wiley and Sons, Toronto.
1991; deVries et al., J. Exp. Med. 173:1205-1211, 1991; Moreau et al., Nature 336:690-692,
1988; Greenberger et al., Proc. Natl. Acad. Sci. U.S.A. 80:2931-2938, 1983; Measurement of
mouse and human interleukin 6-Nordan, R. In Current Protocols in Immunology. J. E.
Coligan eds. Vol 1 pp. 6.6.1-6.6.5, John Wiley and Sons, Toronto. 1991; Smith et al., Proc. Natl.
Acad. Sci. U.S.A. 83:1857-1861, 1986; Measurement of human Interleukin 11-Bennett, F., et al.
In Current Protocols in Immunology. J. E. Coligan eds. Vol 1 pp. 6.15.1 John Wiley and
Sons, Toronto. 1991; Measurement of mouse and human Interleukin 9-Ciarletta, A., Giannotti,
J., Clark, S. C. and Turner, K. J. In Current Protocols in Immunology. J. E. Coligan eds.
Vol 1 pp. 6.13.1, John Wiley and Sons, Toronto. 1991.

Assays for T-cell clone responses to antigens (which will identify, among others, proteins
that affect APC-T cell interactions as well as direct T-cell effects by measuring proliferation and
cytokine production) include, without limitation, those described in: Current Protocols in
Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W

cells including primordial germ cells, embryonic stem cells, hematopoietic stem cells and/or
germ line stem cells. Administration of the polypeptide of the invention to stem cells in vivo or
ex vivo is expected to maintain and expand cell populations in a totipotential or pluripotential
state which would be useful for re-engineering damaged or diseased tissues, transplantation,
manufacture of bio-pharmaceuticals and the development of bio-sensors. The ability to produce
large quantities of human cells has important working applications for the production of human
proteins which currently must be obtained from non-human sources or donors; implantation of
cells to treat diseases such as Parkinson's, Alzheimer's and other neurodegenerative diseases;
tissues for grafting such as bone marrow, skin, cartilage, tendons, bone, muscle (including
cardiac muscle), blood vessels, cornea, neural cells, gastrointestinal cells and others; and organs
for transplantation such as kidney, liver, pancreas (including islet cells), heart and lung.

It is contemplated that multiple different exogenous growth factors and/or cytokines may
be administered in combination with the polypeptide of the invention to achieve the desired
effect, including any of the growth factors listed herein, other stem cell maintenance factors, and
specifically including stem cell factor (SCF), leukemia inhibitory factor (LIF), Flt-3 ligand
(Flt-3L), any of the interleukins, recombinant soluble IL-6 receptor fused to IL-6, macrophage
inflammatory protein 1-alpha (MIP-1-alpha), G-CSF, GM-CSF, thrombopoietin (TPO), platelet
factor 4 (PF-4), platelet-derived growth factor (PDGF), neural growth factors and basic
fibroblast growth factor (bFGF).

Since totipotent stem cells can give rise to virtually any mature cell type, expansion of
these cells in culture will facilitate the production of large quantities of mature cells. Techniques
for culturing stem cells are known in the art and administration of polypeptides of the invention,
optionally with other growth factors and/or cytokines, is expected to enhance the survival and
proliferation of the stem cell populations. This can be accomplished by direct administration of
the polypeptide of the invention to the culture medium. Alternatively, stromal cells transfected
with a polynucleotide that encodes the polypeptide of the invention can be used as a feeder
layer for the stem cell populations in culture or in vivo. Stromal support cells for feeder layers
may include embryonic bone marrow fibroblasts, bone marrow stromal cells, fetal liver cells, or
cultured embryonic fibroblasts (see U.S. Patent No. 5,690,926).

Stem cells themselves can be transfected with a polynucleotide of the invention to induce
autocrine expression of the polypeptide of the invention. This will allow for generation of
undifferentiated totipotential/pluripotential stem cell lines that are useful as is or that can then be
differentiated into the desired mature cell types. These stable cell lines can also serve as a source
of undifferentiated totipotential/pluripotential mRNA to create cDNA libraries and templates for
polymerase chain reaction experiments. These studies would allow for the isolation and
identification of differentially expressed genes in stem cell populations that regulate stem cell
proliferation and/or maintenance.

Expansion and maintenance of totipotent stem cell populations will be useful in the
treatment of many pathological conditions. For example, polypeptides of the present invention
may be used to manipulate stem cells in culture to give rise to neuroepithelial cells that can be
used to augment or replace cells damaged by illness, autoimmune disease, accidental damage or
genetic disorders. The polypeptide of the invention may be useful for inducing the proliferation
of neural cells and for the regeneration of nerve and brain tissue, i.e., for the treatment of central
and peripheral nervous system diseases and neuropathies, as well as mechanical and traumatic
disorders which involve degeneration, death or trauma to neural cells or nerve tissue. In addition,
the expanded stem cell populations can also be genetically altered for gene therapy purposes and
to decrease host rejection of replacement tissues after grafting or implantation.

Expression of the polypeptide of the invention and its effect on stem cells can also be
manipulated to achieve controlled differentiation of the stem cells into more differentiated cell
types. A broadly applicable method of obtaining pure populations of a specific differentiated
cell type from undifferentiated stem cell populations involves the use of a cell-type specific
promoter driving a selectable marker. The selectable marker allows only cells of the desired
type to survive. For example, stem cells can be induced to differentiate into cardiomyocytes
(Wobus et al., Differentiation, 48: 173-182, (1991); Klug et al., J. Clin. Invest., 98(1): 216-224,
(1998)) or skeletal muscle cells (Browder, L. W. In: Principles of Tissue Engineering eds. Lanza
et al., Academic Press (1997)). Alternatively, directed differentiation of stem cells can be
accomplished by culturing the stem cells in the presence of a differentiation factor such as
retinoic acid and an antagonist of the polypeptide of the invention which would inhibit the
effects of endogenous stem cell factor activity and allow differentiation to proceed.

In vitro cultures of stem cells can be used to determine if the polypeptide of the invention
exhibits stem cell growth factor activity. Stem cells are isolated from any one of various cell
sources (including hematopoietic stem cells and embryonic stem cells) and cultured on a feeder
layer, as described by Thompson et al., Proc. Natl. Acad. Sci. U.S.A., 92: 7844-7848 (1995), in
the presence of the polypeptide of the invention alone or in combination with other growth
factors or cytokines. The ability of the polypeptide of the invention to induce stem cell
proliferation is determined by colony formation on semi-solid support, e.g., as described by
Bernstein et al., Blood, 77: 2316-2321 (1991).

4.10.5 HEMATOPOIESIS REGULATING ACTIVITY

A polypeptide of the present invention may be involved in regulation of hematopoiesis
and, consequently, in the treatment of myeloid or lymphoid cell disorders. Even marginal
biological activity in support of colony forming cells or of factor-dependent cell lines indicates
involvement in regulating hematopoiesis, e.g. in supporting the growth and proliferation of
erythroid progenitor cells alone or in combination with other cytokines, thereby indicating
utility, for example, in treating various anemias or for use in conjunction with
irradiation/chemotherapy to stimulate the production of erythroid precursors and/or erythroid
cells; in supporting the growth and proliferation of myeloid cells such as granulocytes and
monocytes/macrophages (i.e., traditional CSF activity) useful, for example, in conjunction with
chemotherapy to prevent or treat consequent myelo-suppression; in supporting the growth and
proliferation of megakaryocytes and consequently of platelets thereby allowing prevention or
treatment of various platelet disorders such as thrombocytopenia, and generally for use in place
of or complementary to platelet transfusions; and/or in supporting the growth and proliferation of
hematopoietic stem cells which are capable of maturing to any and all of the above-mentioned
hematopoietic cells and therefore find therapeutic utility in various stem cell disorders (such as
those usually treated with transplantation, including, without limitation, aplastic anemia and
paroxysmal nocturnal hemoglobinuria), as well as in repopulating the stem cell compartment
post irradiation/chemotherapy, either in-vivo or ex-vivo (i.e., in conjunction with bone marrow
transplantation or with peripheral progenitor cell transplantation (homologous or heterologous))
as normal cells or genetically manipulated for gene therapy.

Therapeutic compositions of the invention can be used in the following: 
Suitable assays for proliferation and differentiation of various hematopoietic lines, 
including, e.g., assays which are cited above. 

Assays for embryonic stem cell differentiation (which will identify, among others,
proteins that influence embryonic differentiation hematopoiesis) include, without limitation,
those described in: Johansson et al., Cellular Biology 15:141-151, 1995; Keller et al., Molecular
and Cellular Biology 13:473-486, 1993; McClanahan et al., Blood 81:2903-2915, 1993.

Assays for stem cell survival and differentiation (which will identify, among others,
proteins that regulate lympho-hematopoiesis) include, without limitation, those described in:
Methylcellulose colony forming assays, Freshney, In Culture of Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 265-268, Wiley-Liss, Inc., New York, N.Y. 1994; Hirayama et al.,
Proc. Natl. Acad. Sci. USA 89:5907-5911, 1992; Primitive hematopoietic colony forming cells
with high proliferative potential, McNiece and Briddell, In Culture of Hematopoietic Cells.
R. I. Freshney, et al. eds. Vol pp. 23-39, Wiley-Liss, Inc., New York, N.Y. 1994; Neben et al.,
Experimental Hematology 22:353-359, 1994; Cobblestone area forming cell assay, Ploemacher,
In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 1-21, Wiley-Liss, Inc.,
New York, N.Y. 1994; Long term bone marrow cultures in the presence of stromal cells,
Spooncer, et al., In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp.
163-179, Wiley-Liss, Inc., New York, N.Y. 1994; Long term culture initiating cell assay,
Sutherland, H. J., In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp.
139-162, Wiley-Liss, Inc., New York, N.Y. 1994.
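
By way of illustration only, the readout of a colony forming assay of the kind cited above can be reduced to a colony frequency and a fold change over an untreated control. The following Python sketch is not taken from any of the cited references; the function names, the normalization to 10^4 plated cells, and the example counts are hypothetical and are chosen only to show the arithmetic.

```python
from statistics import mean


def colonies_per_10k(colony_counts, cells_plated):
    """Mean colony count across replicate dishes, per 10^4 plated cells.

    colony_counts: colonies scored on each replicate dish (hypothetical data).
    cells_plated: number of cells seeded per dish (assumed equal for all dishes).
    """
    if cells_plated <= 0:
        raise ValueError("cells_plated must be positive")
    if not colony_counts:
        raise ValueError("at least one replicate count is required")
    return mean(colony_counts) * 1e4 / cells_plated


def fold_over_control(treated_counts, control_counts, cells_plated):
    """Colony frequency with test polypeptide divided by that of the control."""
    control = colonies_per_10k(control_counts, cells_plated)
    if control == 0:
        raise ValueError("control produced no colonies; fold change is undefined")
    return colonies_per_10k(treated_counts, cells_plated) / control


if __name__ == "__main__":
    # Hypothetical triplicate dishes, 2 x 10^4 cells seeded per dish.
    control = [30, 34, 32]
    treated = [60, 68, 64]
    print(colonies_per_10k(control, 20000))
    print(fold_over_control(treated, control, 20000))
```

A fold change above 1 would be consistent with support of colony forming cells, although what magnitude counts as meaningful depends on the assay and its replicate variability, which this sketch does not address.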

4.10.6 TISSUE GROWTH ACTIVITY 

A polypeptide of the present invention also may be involved in bone, cartilage, tendon,
ligament and/or nerve tissue growth or regeneration, as well as in wound healing and tissue
repair and replacement, and in healing of burns, incisions and ulcers.

A polypeptide of the present invention which induces cartilage and/or bone growth in
circumstances where bone is not normally formed, has application in the healing of bone
fractures and cartilage damage or defects in humans and other animals. Compositions of a
polypeptide, antibody, binding partner, or other modulator of the invention may have
prophylactic use in closed as well as open fracture reduction and also in the improved fixation of
artificial joints. De novo bone formation induced by an osteogenic agent contributes to the repair
of congenital, trauma induced, or oncologic resection induced craniofacial defects, and also is
useful in cosmetic plastic surgery.

A polypeptide of this invention may also be involved in attracting bone-forming cells,
stimulating growth of bone-forming cells, or inducing differentiation of progenitors of
bone-forming cells. Treatment of osteoporosis, osteoarthritis, bone degenerative disorders, or
periodontal disease, such as through stimulation of bone and/or cartilage repair or by blocking
inflammation or processes of tissue destruction (collagenase activity, osteoclast activity, etc.)
mediated by inflammatory processes may also be possible using the composition of the
invention.

Another category of tissue regeneration activity that may involve the polypeptide of the
present invention is tendon/ligament formation. Induction of tendon/ligament-like tissue or
other tissue formation in circumstances where such tissue is not normally formed, has
application in the healing of tendon or ligament tears, deformities and other tendon or ligament
defects in humans and other animals. Such a preparation employing a tendon/ligament-like
tissue inducing protein may have prophylactic use in preventing damage to tendon or ligament
tissue, as well as use in the improved fixation of tendon or ligament to bone or other tissues, and
in repairing defects to tendon or ligament tissue. De novo tendon/ligament-like tissue formation
induced by a composition of the present invention contributes to the repair of congenital, trauma
induced, or other tendon or ligament defects of other origin, and is also useful in cosmetic plastic
surgery for attachment or repair of tendons or ligaments. The compositions of the present
invention may provide an environment to attract tendon- or ligament-forming cells, stimulate
growth of tendon- or ligament-forming cells, induce differentiation of progenitors of tendon- or
ligament-forming cells, or induce growth of tendon/ligament cells or progenitors ex vivo for
return in vivo to effect tissue repair. The compositions of the invention may also be useful in the
treatment of tendonitis, carpal tunnel syndrome and other tendon or ligament defects. The
compositions may also include an appropriate matrix and/or sequestering agent as a carrier as is
well known in the art.

The compositions of the present invention may also be useful for proliferation of neural
cells and for regeneration of nerve and brain tissue, i.e., for the treatment of central and peripheral
nervous system diseases and neuropathies, as well as mechanical and traumatic disorders, which
involve degeneration, death or trauma to neural cells or nerve tissue. More specifically, a
composition may be used in the treatment of diseases of the peripheral nervous system, such as
peripheral nerve injuries, peripheral neuropathy and localized neuropathies, and central nervous
system diseases, such as Alzheimer's, Parkinson's disease, Huntington's disease, amyotrophic
lateral sclerosis, and Shy-Drager syndrome. Further conditions which may be treated in
accordance with the present invention include mechanical and traumatic disorders, such as spinal
cord disorders, head trauma and cerebrovascular diseases such as stroke. Peripheral neuropathies
resulting from chemotherapy or other medical therapies may also be treatable using a
composition of the invention.

Compositions of the invention may also be useful to promote better or faster closure of
non-healing wounds, including without limitation pressure ulcers, ulcers associated with vascular
insufficiency, surgical and traumatic wounds, and the like.

Compositions of the present invention may also be involved in the generation or
regeneration of other tissues, such as organs (including, for example, pancreas, liver, intestine,
kidney, skin, endothelium), muscle (smooth, skeletal or cardiac) and vascular (including vascular
endothelium) tissue, or for promoting the growth of cells comprising such tissues. Part of the
desired effects may be by inhibition or modulation of fibrotic scarring to allow normal tissue
to regenerate. A polypeptide of the present invention may also exhibit angiogenic activity.

A composition of the present invention may also be useful for gut protection or
regeneration and treatment of lung or liver fibrosis, reperfusion injury in various tissues, and
conditions resulting from systemic cytokine damage.

A composition of the present invention may also be useful for promoting or inhibiting
differentiation of tissues described above from precursor tissues or cells; or for inhibiting the
growth of tissues described above.

Therapeutic compositions of the invention can be used in the following: 

Assays for tissue generation activity include, without limitation, those described in: 
International Patent Publication No. WO95/16035 (bone, cartilage, tendon); International Patent 
Publication No. WO95/05846 (nerve, neuronal); International Patent Publication No. 
WO91/07491 (skin, endothelium). 

Assays for wound healing activity include, without limitation, those described in: Winter,
Epidermal Wound Healing, pp. 71-112 (Maibach, H. I. and Rovee, D. T., eds.), Year Book
Medical Publishers, Inc., Chicago, as modified by Eaglstein and Mertz, J. Invest. Dermatol.
71:382-84 (1978).

4.10.7 IMMUNE STIMULATING OR SUPPRESSING ACTIVITY 
A polypeptide of the present invention may also exhibit immune stimulating or immune
suppressing activity, including without limitation the activities for which assays are described
herein. A polynucleotide of the invention can encode a polypeptide exhibiting such activities. A
protein may be useful in the treatment of various immune deficiencies and disorders (including
severe combined immunodeficiency (SCID)), e.g., in regulating (up or down) growth and
proliferation of T and/or B lymphocytes, as well as affecting the cytolytic activity of NK cells
and other cell populations. These immune deficiencies may be genetic or be caused by viral (e.g.,
HIV) as well as bacterial or fungal infections, or may result from autoimmune disorders. More
specifically, infectious diseases caused by viral, bacterial, fungal or other infection may be
treatable using a protein of the present invention, including infections by HIV, hepatitis viruses,
herpes viruses, mycobacteria, Leishmania spp., malaria spp. and various fungal infections such
as candidiasis. Of course, in this regard, proteins of the present invention may also be useful
where a boost to the immune system generally may be desirable, i.e., in the treatment of cancer.

Autoimmune disorders which may be treated using a protein of the present invention
include, for example, connective tissue disease, multiple sclerosis, systemic lupus erythematosus,
rheumatoid arthritis, autoimmune pulmonary inflammation, Guillain-Barre syndrome,
autoimmune thyroiditis, insulin dependent diabetes mellitus, myasthenia gravis, graft-versus-host
disease and autoimmune inflammatory eye disease. Such a protein (or antagonists thereof,
including antibodies) of the present invention may also be useful in the treatment of allergic
reactions and conditions (e.g., anaphylaxis, serum sickness, drug reactions, food allergies, insect
venom allergies, mastocytosis, allergic rhinitis, hypersensitivity pneumonitis, urticaria,
angioedema, eczema, atopic dermatitis, allergic contact dermatitis, erythema multiforme,
Stevens-Johnson syndrome, allergic conjunctivitis, atopic keratoconjunctivitis, vernal
keratoconjunctivitis, giant papillary conjunctivitis and contact allergies), such as asthma
(particularly allergic asthma) or other respiratory problems. Other conditions, in which immune
suppression is desired (including, for example, organ transplantation), may also be treatable
using a protein (or antagonists thereof) of the present invention. The therapeutic effects of the
polypeptides or antagonists thereof on allergic reactions can be evaluated by in vivo animal
models such as the cumulative contact enhancement test (Lastbom et al., Toxicology 125: 59-66,
1998), skin prick test (Hoffmann et al., Allergy 54: 446-54, 1999), guinea pig skin sensitization
test (Vohr et al., Arch. Toxicol. 73: 501-9), and murine local lymph node assay (Kimber et al.,
J. Toxicol. Environ. Health 53: 563-79).

Using the proteins of the invention it may also be possible to modulate immune
responses in a number of ways. Down regulation may be in the form of inhibiting or blocking an
immune response already in progress or may involve preventing the induction of an immune
response. The functions of activated T cells may be inhibited by suppressing T cell responses or
by inducing specific tolerance in T cells, or both. Immunosuppression of T cell responses is
generally an active, non-antigen-specific, process which requires continuous exposure of the T
cells to the suppressive agent. Tolerance, which involves inducing non-responsiveness or anergy
in T cells, is distinguishable from immunosuppression in that it is generally antigen-specific and
persists after exposure to the tolerizing agent has ceased. Operationally, tolerance can be
demonstrated by the lack of a T cell response upon reexposure to specific antigen in the absence
of the tolerizing agent.

Down regulating or preventing one or more antigen functions (including without
limitation B lymphocyte antigen functions (such as, for example, B7)), e.g., preventing high
level lymphokine synthesis by activated T cells, will be useful in situations of tissue, skin and
organ transplantation and in graft-versus-host disease (GVHD). For example, blockage of T cell
function should result in reduced tissue destruction in tissue transplantation. Typically, in tissue
transplants, rejection of the transplant is initiated through its recognition as foreign by T cells,
followed by an immune reaction that destroys the transplant. The administration of a therapeutic
composition of the invention may prevent cytokine synthesis by immune cells, such as T cells,
and thus acts as an immunosuppressant. Moreover, a lack of costimulation may also be sufficient
to anergize the T cells, thereby inducing tolerance in a subject. Induction of long-term tolerance
by B lymphocyte antigen-blocking reagents may avoid the necessity of repeated administration
of these blocking reagents. To achieve sufficient immunosuppression or tolerance in a subject, it
may also be necessary to block the function of a combination of B lymphocyte antigens.

The efficacy of particular therapeutic compositions in preventing organ transplant
rejection or GVHD can be assessed using animal models that are predictive of efficacy in
humans. Examples of appropriate systems which can be used include allogeneic cardiac grafts in
rats and xenogeneic pancreatic islet cell grafts in mice, both of which have been used to examine
the immunosuppressive effects of CTLA4Ig fusion proteins in vivo as described in Lenschow et
al., Science 257:789-792 (1992) and Turka et al., Proc. Natl. Acad. Sci. USA, 89:11102-11105
(1992). In addition, murine models of GVHD (see Paul ed., Fundamental Immunology, Raven
Press, New York, 1989, pp. 846-847) can be used to determine the effect of therapeutic
compositions of the invention on the development of that disease.

Blocking antigen function may also be therapeutically useful for treating autoimmune
diseases. Many autoimmune disorders are the result of inappropriate activation of T cells that are
reactive against self tissue and which promote the production of cytokines and autoantibodies
involved in the pathology of the diseases. Preventing the activation of autoreactive T cells may
reduce or eliminate disease symptoms. Administration of reagents which block stimulation of T
cells can be used to inhibit T cell activation and prevent production of autoantibodies or T
cell-derived cytokines which may be involved in the disease process. Additionally, blocking
reagents may induce antigen-specific tolerance of autoreactive T cells which could lead to
long-term relief from the disease. The efficacy of blocking reagents in preventing or alleviating
autoimmune disorders can be determined using a number of well-characterized animal models of
human autoimmune diseases. Examples include murine experimental autoimmune encephalitis,
systemic lupus erythematosus in MRL/lpr/lpr mice or NZB hybrid mice, murine autoimmune
collagen arthritis, diabetes mellitus in NOD mice and BB rats, and murine experimental
myasthenia gravis (see Paul ed., Fundamental Immunology, Raven Press, New York, 1989, pp.
840-856).

Upregulation of an antigen function (e.g., a B lymphocyte antigen function), as a means
of up regulating immune responses, may also be useful in therapy. Upregulation of immune
responses may be in the form of enhancing an existing immune response or eliciting an initial
immune response. For example, enhancing an immune response may be useful in cases of viral
infection, including systemic viral diseases such as influenza, the common cold, and
encephalitis.

Alternatively, anti-viral immune responses may be enhanced in an infected patient by
removing T cells from the patient, costimulating the T cells in vitro with viral antigen-pulsed
APCs either expressing a peptide of the present invention or together with a stimulatory form of
a soluble peptide of the present invention and reintroducing the in vitro activated T cells into the
patient. Another method of enhancing anti-viral immune responses would be to isolate infected
cells from a patient, transfect them with a nucleic acid encoding a protein of the present
invention as described herein such that the cells express all or a portion of the protein on their
surface, and reintroduce the transfected cells into the patient. The infected cells would now be
capable of delivering a costimulatory signal to, and thereby activate, T cells in vivo.

A polypeptide of the present invention may provide the necessary stimulation signal to T
cells to induce a T cell mediated immune response against the transfected tumor cells. In
addition, tumor cells which lack MHC class I or MHC class II molecules, or which fail to
reexpress sufficient amounts of MHC class I or MHC class II molecules, can be transfected with
nucleic acid encoding all or a portion (e.g., a cytoplasmic-domain truncated portion) of an
MHC class I alpha chain protein and β2 microglobulin protein or an MHC class II alpha chain
protein and an MHC class II beta chain protein to thereby express MHC class I or MHC class II
proteins on the cell surface. Expression of the appropriate class I or class II MHC in conjunction
with a peptide having the activity of a B lymphocyte antigen (e.g., B7-1, B7-2, B7-3) induces a T
cell mediated immune response against the transfected tumor cell. Optionally, a gene encoding
an antisense construct which blocks expression of an MHC class II associated protein, such as
the invariant chain, can also be cotransfected with a DNA encoding a peptide having the activity
of a B lymphocyte antigen to promote presentation of tumor associated antigens and induce
tumor specific immunity. Thus, the induction of a T cell mediated immune response in a human
subject may be sufficient to overcome tumor-specific tolerance in the subject.

The activity of a protein of the invention may, among other means, be measured by the
following methods:

Suitable assays for thymocyte or splenocyte cytotoxicity include, without limitation,
those described in: Current Protocols in Immunology, Ed by J. E. Coligan, et al., Pub.
Greene Publishing Associates and Wiley-Interscience (Chapter 3, In vitro assays for Mouse
Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in Humans); Herrmann et al.,
Proc. Natl. Acad. Sci. USA 78:2488-2492, 1981; Herrmann et al., J. Immunol. 128:1968-1974,
1982; Handa et al., J. Immunol. 135:1564-1572, 1985; Takai et al., J. Immunol. 137:3494-3500,
1986; Takai et al., J. Immunol. 140:508-512, 1988; Bowman et al., J. Virology 61:1992-1998;
Bertagnolli et al., Cellular Immunology 133:327-341, 1991; Brown et al., J. Immunol.
153:3079-3092, 1994.

Assays for T-cell-dependent immunoglobulin responses and isotype switching (which
will identify, among others, proteins that modulate T-cell dependent antibody responses and that
affect Th1/Th2 profiles) include, without limitation, those described in: Maliszewski, J.
Immunol. 144:3028-3033, 1990; and Assays for B cell function: In vitro antibody production,
Mond, J. J. and Brunswick, M. In Current Protocols in Immunology. J. E. Coligan et al. eds.
Vol 1 pp. 3.8.1-3.8.16, John Wiley and Sons, Toronto. 1994.

Mixed lymphocyte reaction (MLR) assays (which will identify, among others, proteins
that generate predominantly Th1 and CTL responses) include, without limitation, those described
in: Current Protocols in Immunology, Ed by J. E. Coligan, et al., Pub. Greene Publishing
Associates and Wiley-Interscience (Chapter 3, In vitro assays for Mouse Lymphocyte Function
3.1-3.19; Chapter 7, Immunologic studies in Humans); Takai et al., J. Immunol. 137:3494-3500,
1986; Takai et al., J. Immunol. 140:508-512, 1988; Bertagnolli et al., J. Immunol.
149:3778-3783, 1992.

Dendritic cell-dependent assays (which will identify, among others, proteins expressed
by dendritic cells that activate naive T-cells) include, without limitation, those described in:
Guery et al., J. Immunol. 134:536-544, 1995; Inaba et al., J. Exper. Med. 173:549-559, 1991;
Macatonia et al., J. Immunology 154:5071-5079, 1995; Porgador et al., J. Exper. Med. 182:
255-260, 1995; Nair et al., J. Virology 67:4062-4069, 1993; Huang et al., Science 264:961-965,
1994; Macatonia et al., J. Exper. Med. 169:1255-1264, 1989; Bhardwaj et al., J. Clinical
Investigation 94:797-807, 1994; and Inaba et al., J. Exper. Med. 172:631-640, 1990.

Assays for lymphocyte survival/apoptosis (which will identify, among others, proteins
that prevent apoptosis after superantigen induction and proteins that regulate lymphocyte
homeostasis) include, without limitation, those described in: Darzynkiewicz et al., Cytometry
13:795-808, 1992; Gorczyca et al., Leukemia 7:659-670, 1993; Gorczyca et al., Cancer
Research 53:1945-1951, 1993; Itoh et al., Cell 66:233-243, 1991; Zacharchuk, J. Immunol.
145:4037-4045, 1990; Zamai et al., Cytometry 14:891-897, 1993; Gorczyca et al., Int'l
J. Oncology 1:639-648, 1992.

Assays for proteins that influence early steps of T-cell commitment and development
include, without limitation, those described in: Antica et al., Blood 84:111-117, 1994; Fine et al.,
Cellular Immunology 155:111-122, 1994; Galy et al., Blood 85:2770-2778, 1995; Toki et al.,
Proc. Nat. Acad. Sci. USA 88:7548-7551, 1991.

4.10.8 ACTIVIN/INHIBIN ACTIVITY

A polypeptide of the present invention may also exhibit activin- or inhibin-related
activities. A polynucleotide of the invention may encode a polypeptide exhibiting such
characteristics. Inhibins are characterized by their ability to inhibit the release of follicle
stimulating hormone (FSH), while activins are characterized by their ability to stimulate the
release of follicle stimulating hormone (FSH). Thus, a polypeptide of the present invention,
alone or in heterodimers with a member of the inhibin family, may be useful as a contraceptive
based on the ability of inhibins to decrease fertility in female mammals and decrease
spermatogenesis in male mammals. Administration of sufficient amounts of other inhibins can
induce infertility in these mammals. Alternatively, the polypeptide of the invention, as a
homodimer or as a heterodimer with other protein subunits of the inhibin group, may be useful as
a fertility inducing therapeutic, based upon the ability of activin molecules to stimulate FSH
release from cells of the anterior pituitary. See, for example, U.S. Pat. No. 4,798,885. A
polypeptide of the invention may also be useful for advancement of the onset of fertility in
sexually immature mammals, so as to increase the lifetime reproductive performance of domestic
animals such as, but not limited to, cows, sheep and pigs.

The activity of a polypeptide of the invention may, among other means, be measured by
the following methods.

Assays for activin/inhibin activity include, without limitation, those described in: Vale et
al., Endocrinology 91:562-572, 1972; Ling et al., Nature 321:779-782, 1986; Vale et al., Nature
321:776-779, 1986; Mason et al., Nature 318:659-663, 1985; Forage et al., Proc. Natl. Acad.
Sci. USA 83:3091-3095, 1986.



4.10.9 CHEMOTACTIC/CHEMOKINETIC ACTIVITY

A polypeptide of the present invention may be involved in chemotactic or chemokinetic
activity for mammalian cells, including, for example, monocytes, fibroblasts, neutrophils,
T-cells, mast cells, eosinophils, epithelial and/or endothelial cells. A polynucleotide of the
invention can encode a polypeptide exhibiting such attributes. Chemotactic and chemokinetic
receptor activation can be used to mobilize or attract a desired cell population to a desired site of
action. Chemotactic or chemokinetic compositions (e.g. proteins, antibodies, binding partners, or
modulators of the invention) provide particular advantages in treatment of wounds and other
trauma to tissues, as well as in treatment of localized infections. For example, attraction of
lymphocytes, monocytes or neutrophils to tumors or sites of infection may result in improved
immune responses against the tumor or infecting agent.

A protein or peptide has chemotactic activity for a particular cell population if it can
stimulate, directly or indirectly, the directed orientation or movement of such cell population.
Preferably, the protein or peptide has the ability to directly stimulate directed movement of cells.
Whether a particular protein has chemotactic activity for a population of cells can be readily
determined by employing such protein or peptide in any known assay for cell chemotaxis.

Therapeutic compositions of the invention can be used in the following:

Assays for chemotactic activity (which will identify proteins that induce or prevent
chemotaxis) consist of assays that measure the ability of a protein to induce the migration of
cells across a membrane as well as the ability of a protein to induce the adhesion of one cell
population to another cell population. Suitable assays for movement and adhesion include,
without limitation, those described in: Current Protocols in Immunology, Ed by J. E.
Coligan, et al., Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 6.12,
Measurement of alpha and beta Chemokines 6.12.1-6.12.28); Taub et al., J. Clin. Invest.
95:1370-1376, 1995; Lind et al., APMIS 103:140-146, 1995; Muller et al., Eur. J. Immunol.
25:1744-1748; Gruber et al., J. Immunol. 152:5860-5867, 1994; Johnston et al., J. Immunol.
153:1762-1768, 1994.

4.10.10 HEMOSTATIC AND THROMBOLYTIC ACTIVITY 

A polypeptide of the invention may also be involved in hemostasis or thrombolysis or 
thrombosis. A polynucleotide of the invention can encode a polypeptide exhibiting such 
attributes. Compositions may be useful in treatment of various coagulation disorders (including 
hereditary disorders, such as hemophilias) or to enhance coagulation and other hemostatic events 
in treating wounds resulting from trauma, surgery or other causes. A composition of the 
invention may also be useful for dissolving or inhibiting formation of thromboses and for 
treatment and prevention of conditions resulting therefrom (such as, for example, infarction of 
cardiac and central nervous system vessels (e.g., stroke)). 

Therapeutic compositions of the invention can be used in the following: 
Assays for hemostatic and thrombolytic activity include, without limitation, those 
described in: Linet et al., J. Clin. Pharmacol. 26:131-140, 1986; Burdick et al., Thrombosis Res. 
45:413-419, 1987; Humphrey et al., Fibrinolysis 5:71-79 (1991); Schaub, Prostaglandins 
35:467-474, 1988. 

4.10.11 CANCER DIAGNOSIS AND THERAPY 

Polypeptides of the invention may be involved in cancer cell generation, proliferation or 
metastasis. Detection of the presence or amount of polynucleotides or polypeptides of the 
invention may be useful for the diagnosis and/or prognosis of one or more types of cancer. For 
example, the presence or increased expression of a polynucleotide/polypeptide of the invention 
may indicate a hereditary risk of cancer, a precancerous condition, or an ongoing malignancy. 
Conversely, a defect in the gene or absence of the polypeptide may be associated with a cancer 
condition. Identification of single nucleotide polymorphisms associated with cancer or a 
predisposition to cancer may also be useful for diagnosis or prognosis. 

Cancer treatments promote tumor regression by inhibiting tumor cell proliferation, 
inhibiting angiogenesis (growth of new blood vessels that is necessary to support tumor growth) 
and/or prohibiting metastasis by reducing tumor cell motility or invasiveness. Therapeutic 
compositions of the invention may be effective in adult and pediatric oncology including in solid 
phase tumors/malignancies, locally advanced tumors, human soft tissue sarcomas, metastatic 
cancer, including lymphatic metastases, blood cell malignancies including multiple myeloma, 
acute and chronic leukemias, and lymphomas, head and neck cancers including mouth cancer, 
larynx cancer and thyroid cancer, lung cancers including small cell carcinoma and non-small cell 
cancers, breast cancers including small cell carcinoma and ductal carcinoma, gastrointestinal 
cancers including esophageal cancer, stomach cancer, colon cancer, colorectal cancer and polyps 
associated with colorectal neoplasia, pancreatic cancers, liver cancer, urologic cancers including 
bladder cancer and prostate cancer, malignancies of the female genital tract including ovarian 
carcinoma, uterine (including endometrial) cancers, and solid tumor in the ovarian follicle, 
kidney cancers including renal cell carcinoma, brain cancers including intrinsic brain tumors, 
neuroblastoma, astrocytic brain tumors, gliomas, metastatic tumor cell invasion in the central 
nervous system, bone cancers including osteomas, skin cancers including malignant melanoma, 
tumor progression of human skin keratinocytes, squamous cell carcinoma, basal cell carcinoma, 
hemangiopericytoma and Kaposi's sarcoma. 

Polypeptides, polynucleotides, or modulators of polypeptides of the invention (including 
inhibitors and stimulators of the biological activity of the polypeptide of the invention) may be 
administered to treat cancer. Therapeutic compositions can be administered in therapeutically 
effective dosages alone or in combination with adjuvant cancer therapy such as surgery, 
chemotherapy, radiotherapy, thermotherapy, and laser therapy, and may provide a beneficial 
effect, e.g., reducing tumor size, slowing rate of tumor growth, inhibiting metastasis, or otherwise 
improving overall clinical condition, without necessarily eradicating the cancer. 

The composition can also be administered in therapeutically effective amounts as a 
portion of an anti-cancer cocktail. An anti-cancer cocktail is a mixture of the polypeptide or 
modulator of the invention with one or more anti-cancer drugs in addition to a pharmaceutically 
acceptable carrier for delivery. The use of anti-cancer cocktails as a cancer treatment is routine. 
Anti-cancer drugs that are well known in the art and can be used as a treatment in combination 
with the polypeptide or modulator of the invention include: Actinomycin D, Aminoglutethimide, 
Asparaginase, Bleomycin, Busulfan, Carboplatin, Carmustine, Chlorambucil, Cisplatin (cis-DDP), 
Cyclophosphamide, Cytarabine HCl (Cytosine arabinoside), Dacarbazine, Dactinomycin, 
Daunorubicin HCl, Doxorubicin HCl, Estramustine phosphate sodium, Etoposide (VP16-213), 
Floxuridine, 5-Fluorouracil (5-FU), Flutamide, Hydroxyurea (hydroxycarbamide), Ifosfamide, 
Interferon Alpha-2a, Interferon Alpha-2b, Leuprolide acetate (LHRH-releasing factor analog), 
Lomustine, Mechlorethamine HCl (nitrogen mustard), Melphalan, Mercaptopurine, Mesna, 
Methotrexate (MTX), Mitomycin, Mitoxantrone HCl, Octreotide, Plicamycin, Procarbazine HCl, 
Streptozocin, Tamoxifen citrate, Thioguanine, Thiotepa, Vinblastine sulfate, Vincristine sulfate, 
Amsacrine, Azacitidine, Hexamethylmelamine, Interleukin-2, Mitoguazone, Pentostatin, 
Semustine, Teniposide, and Vindesine sulfate. 

In addition, therapeutic compositions of the invention may be used for prophylactic 
treatment of cancer. There are hereditary conditions and/or environmental situations (e.g., 
exposure to carcinogens) known in the art that predispose an individual to developing cancers. 
Under these circumstances, it may be beneficial to treat these individuals with therapeutically 
effective doses of the polypeptide of the invention to reduce the risk of developing cancers. 

In vitro models can be used to determine the effective doses of the polypeptide of the 
invention as a potential cancer treatment. These in vitro models include proliferation assays of 
cultured tumor cells, growth of cultured tumor cells in soft agar (see Freshney, (1987) Culture of 
Animal Cells: A Manual of Basic Technique, Wiley-Liss, New York, NY Ch 18 and Ch 21), 
tumor systems in nude mice as described in Giovanella et al., J. Natl. Can. Inst., 52: 921-30 
(1974), mobility and invasive potential of tumor cells in Boyden Chamber assays as described in 
Pilkington et al., Anticancer Res., 17: 4107-9 (1997), and angiogenesis assays such as induction 
of vascularization of the chick chorioallantoic membrane or induction of vascular endothelial 
cell migration as described in Ribatta et al., Intl. J. Dev. Biol., 40: 1189-97 (1999) and 
Clin. Exp. Metastasis, 17:423-9 (1999), respectively. Suitable tumor cell lines are available, 
e.g., from American Type Tissue Culture Collection catalogs. 

4.10.12 RECEPTOR/LIGAND ACTIVITY 

A polypeptide of the present invention may also demonstrate activity as receptor, 
receptor ligand or inhibitor or agonist of receptor/ligand interactions. A polynucleotide of the 
invention can encode a polypeptide exhibiting such characteristics. Examples of such receptors 
and ligands include, without limitation, cytokine receptors and their ligands, receptor kinases and 
their ligands, receptor phosphatases and their ligands, receptors involved in cell-cell interactions 
and their ligands (including without limitation, cellular adhesion molecules (such as selectins, 
integrins and their ligands)) and receptor/ligand pairs involved in antigen presentation, antigen 
recognition and development of cellular and humoral immune responses. Receptors and ligands 
are also useful for screening of potential peptide or small molecule inhibitors of the relevant 
receptor/ligand interaction. A protein of the present invention (including, without limitation, 
fragments of receptors and ligands) may itself be useful as an inhibitor of receptor/ligand 
interactions. 

The activity of a polypeptide of the invention may, among other means, be measured by 
the following methods: 

Suitable assays for receptor-ligand activity include without limitation those described in: 
Current Protocols in Immunology, Ed. by J. E. Coligan et al., Pub. Greene Publishing 
Associates and Wiley-Interscience (Chapter 7.28, Measurement of Cellular Adhesion under 
static conditions 7.28.1-7.28.22); Takai et al., Proc. Natl. Acad. Sci. USA 84:6864-6868, 1987; 
Bierer et al., J. Exp. Med. 168:1145-1156, 1988; Rosenstein et al., J. Exp. Med. 169:149-160, 1989; 
Stoltenborg et al., J. Immunol. Methods 175:59-68, 1994; Stitt et al., Cell 80:661-670, 1995. 

By way of example, the polypeptides of the invention may be used as a receptor for a 
ligand(s) thereby transmitting the biological activity of that ligand(s). Ligands may be identified 
through binding assays, affinity chromatography, dihybrid screening assays, BIAcore assays, gel 
overlay assays, or other methods known in the art. 

Studies characterizing drugs or proteins as agonists, antagonists, partial agonists or 
partial antagonists require the use of other proteins as competing ligands. The polypeptides of the 
present invention or ligand(s) thereof may be labeled by being coupled to radioisotopes, 
colorimetric molecules or toxin molecules by conventional methods ("Guide to Protein 
Purification", Murray P. Deutscher (ed.), Methods in Enzymology Vol. 182 (1990), Academic 
Press, Inc., San Diego). Examples of radioisotopes include, but are not limited to, tritium and 
carbon-14. Examples of colorimetric molecules include, but are not limited to, fluorescent 
molecules such as fluorescamine, or rhodamine or other colorimetric molecules. Examples of 
toxins include, but are not limited to, ricin. 

4.10.13 DRUG SCREENING 

This invention is particularly useful for screening chemical compounds by using the 
novel polypeptides or binding fragments thereof in any of a variety of drug screening techniques. 
The polypeptides or fragments employed in such a test may either be free in solution, affixed to a 
solid support, borne on a cell surface or located intracellularly. One method of drug screening 
utilizes eukaryotic or prokaryotic host cells which are stably transformed with recombinant 
nucleic acids expressing the polypeptide or a fragment thereof. Drugs are screened against such 
transformed cells in competitive binding assays. Such cells, either in viable or fixed form, can 
be used for standard binding assays. One may measure, for example, the formation of 
complexes between polypeptides of the invention or fragments and the agent being tested or 
examine the diminution in complex formation between the novel polypeptides and an 
appropriate cell line, which are well known in the art. 
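
The "diminution in complex formation" read out by a competitive binding assay is often summarized as a percent inhibition. The following is a minimal sketch only, assuming a generic numeric signal (for example, counts from a labeled ligand); the function name, the background-subtraction step and the numbers in the usage note are illustrative assumptions and are not taken from this disclosure.

```python
def percent_inhibition(signal_with_compound, signal_without_compound, background=0.0):
    """Percent reduction in complex formation caused by a test compound.

    All three arguments are raw assay signals in the same (arbitrary) units.
    `background` is the signal measured with no polypeptide present; it is a
    hypothetical control added for illustration.
    """
    window = signal_without_compound - background
    if window <= 0:
        raise ValueError("uninhibited signal must exceed background")
    bound_fraction = (signal_with_compound - background) / window
    return 100.0 * (1.0 - bound_fraction)
```

For instance, with hypothetical readings of 550 (compound present), 1000 (no compound) and 100 (background), the function reports that the compound displaced half of the specific signal.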

Sources for test compounds that may be screened for ability to bind to or modulate (i.e., 
increase or decrease) the activity of polypeptides of the invention include (1) inorganic and 
organic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries 
comprised of either random or mimetic peptides, oligonucleotides or organic molecules. 

Chemical libraries may be readily synthesized or purchased from a number of 
commercial sources, and may include structural analogs of known compounds or compounds 
that are identified as "hits" or "leads" via natural product screening. 

The sources of natural product libraries are microorganisms (including bacteria and 
fungi), animals, plants or other vegetation, or marine organisms, and libraries of mixtures for 
screening may be created by: (1) fermentation and extraction of broths from soil, plant or marine 
microorganisms or (2) extraction of the organisms themselves. Natural product libraries include 
polyketides, non-ribosomal peptides, and (non-naturally occurring) variants thereof. For a 
review, see Science 282:63-68 (1998). 

Combinatorial libraries are composed of large numbers of peptides, oligonucleotides or 
organic compounds and can be readily prepared by traditional automated synthesis methods, 
PCR, cloning or proprietary synthetic methods. Of particular interest are peptide and 
oligonucleotide combinatorial libraries. Still other libraries of interest include peptide, protein, 
peptidomimetic, multiparallel synthetic collection, recombinatorial, and polypeptide libraries. 
For a review of combinatorial chemistry and libraries created therefrom, see Myers, Curr. Opin. 
Biotechnol. 8:701-707 (1997). For reviews and examples of peptidomimetic libraries, see 
Al-Obeidi et al., Mol. Biotechnol. 9(3):205-23 (1998); Hruby et al., Curr. Opin. Chem. Biol. 
1(1):114-19 (1997); Dorner et al., Bioorg. Med. Chem. 4(5):709-15 (1996) (alkylated dipeptides). 

Identification of modulators through use of the various libraries described herein permits 
modification of the candidate "hit" (or "lead") to optimize the capacity of the "hit" to bind a 
polypeptide of the invention. The molecules identified in the binding assay are then tested for 
antagonist or agonist activity in in vivo tissue culture or animal models that are well known in the 
art. In brief, the molecules are titrated into a plurality of cell cultures or animals and then tested 
for either cell/animal death or prolonged survival of the animal/cells. 

The binding molecules thus identified may be complexed with toxins, e.g., ricin or 
cholera, or with other compounds that are toxic to cells such as radioisotopes. The toxin-binding 
molecule complex is then targeted to a tumor or other cell by the specificity of the binding 
molecule for a polypeptide of the invention. Alternatively, the binding molecules may be 
complexed with imaging agents for targeting and imaging purposes. 

4.10.14 ASSAY FOR RECEPTOR ACTIVITY 

The invention also provides methods to detect specific binding of a polypeptide, e.g., a 
ligand or a receptor. The art provides numerous assays particularly useful for identifying 
previously unknown binding partners for receptor polypeptides of the invention. For example, 
expression cloning using mammalian or bacterial cells, or dihybrid screening assays can be used 
to identify polynucleotides encoding binding partners. As another example, affinity 
chromatography with the appropriate immobilized polypeptide of the invention can be used to 
isolate polypeptides that recognize and bind polypeptides of the invention. There are a number 
of different libraries used for the identification of compounds, and in particular small molecules, 
that modulate (i.e., increase or decrease) biological activity of a polypeptide of the invention. 
Ligands for receptor polypeptides of the invention can also be identified by adding exogenous 
ligands, or cocktails of ligands, to two cell populations that are genetically identical except for 
the expression of the receptor of the invention: one cell population expresses the receptor of the 
invention whereas the other does not. The responses of the two cell populations to the addition of 
ligand(s) are then compared. Alternatively, an expression library can be co-expressed with the 
polypeptide of the invention in cells and assayed for an autocrine response to identify potential 
ligand(s). As still another example, BIAcore assays, gel overlay assays, or other methods known 
in the art can be used to identify binding partner polypeptides, including, (1) organic and 
inorganic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries 
comprised of random peptides, oligonucleotides or organic molecules. 
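
The two-population comparison described above can be sketched as a simple fold-change filter. This is an illustration under stated assumptions, not a method taken from this disclosure: the response values are taken to be positive numeric readouts in arbitrary units, and the function name, the ligand labels and the 2-fold cutoff are all hypothetical.

```python
def candidate_ligands(receptor_cells, control_cells, min_fold=2.0):
    """Rank ligands by response in receptor-expressing cells relative to control cells.

    Both arguments map a ligand label to a measured response. A ligand is
    reported only when the receptor-expressing population responds at least
    `min_fold` times as strongly as the otherwise identical control population.
    """
    hits = []
    for ligand, response in receptor_cells.items():
        baseline = control_cells.get(ligand)
        if baseline is None or baseline <= 0:
            continue  # no usable control measurement for this ligand
        fold = response / baseline
        if fold >= min_fold:
            hits.append((ligand, fold))
    # strongest differential response first
    return sorted(hits, key=lambda hit: hit[1], reverse=True)
```

A ligand that stimulates both populations equally gives a fold change near 1 and is filtered out, which is the purpose of the receptor-negative control population.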

The role of downstream intracellular signaling molecules in the signaling cascade of the 
polypeptide of the invention can be determined. For example, a chimeric protein in which the 
cytoplasmic domain of the polypeptide of the invention is fused to the extracellular portion of a 
protein, whose ligand has been identified, is produced in a host cell. The cell is then incubated 
with the ligand specific for the extracellular portion of the chimeric protein, thereby activating 
the chimeric receptor. Known downstream proteins involved in intracellular signaling can then 
be assayed for expected modifications, i.e., phosphorylation. Other methods known to those in the 
art can also be used to identify signaling molecules involved in receptor activity. 

4.10.15 ANTI-INFLAMMATORY ACTIVITY 

Compositions of the present invention may also exhibit anti-inflammatory activity. The 
anti-inflammatory activity may be achieved by providing a stimulus to cells involved in the 
inflammatory response, by inhibiting or promoting cell-cell interactions (such as, for example, 
cell adhesion), by inhibiting or promoting chemotaxis of cells involved in the inflammatory 
process, inhibiting or promoting cell extravasation, or by stimulating or suppressing production 
of other factors which more directly inhibit or promote an inflammatory response. Compositions 
with such activities can be used to treat inflammatory conditions (including chronic or acute 
conditions), including without limitation inflammation associated with infection (such as septic 
shock, sepsis or systemic inflammatory response syndrome (SIRS)), ischemia-reperfusion injury, 
endotoxin lethality, arthritis, complement-mediated hyperacute rejection, nephritis, cytokine or 
chemokine-induced lung injury, inflammatory bowel disease, Crohn's disease or resulting from 
overproduction of cytokines such as TNF or IL-1. Compositions of the invention may also be 
useful to treat anaphylaxis and hypersensitivity to an antigenic substance or material. 
Compositions of this invention may be utilized to prevent or treat conditions such as, but not 
limited to, sepsis, acute pancreatitis, endotoxin shock, cytokine induced shock, rheumatoid 
arthritis, chronic inflammatory arthritis, pancreatic cell damage from diabetes mellitus type 1, 
graft versus host disease, inflammatory bowel disease, inflammation associated with pulmonary 
disease, other autoimmune disease or inflammatory disease, or as an antiproliferative agent such as for 
acute or chronic myelogenous leukemia or in the prevention of premature labor secondary to 
intrauterine infections. 

4.10.16 LEUKEMIAS 

Leukemias and related disorders may be treated or prevented by administration of a 
therapeutic that promotes or inhibits function of the polynucleotides and/or polypeptides of the 
invention. Such leukemias and related disorders include but are not limited to acute leukemia, 
acute lymphocytic leukemia, acute myelocytic leukemia, myeloblastic, promyelocytic, 
myelomonocytic, monocytic, erythroleukemia, chronic leukemia, chronic myelocytic 
(granulocytic) leukemia and chronic lymphocytic leukemia (for a review of such disorders, see 
Fishman et al., 1985, Medicine, 2d Ed., J.B. Lippincott Co., Philadelphia). 

4.10.17 NERVOUS SYSTEM DISORDERS 

Nervous system disorders, involving cell types which can be tested for efficacy of 
intervention with compounds that modulate the activity of the polynucleotides and/or 
polypeptides of the invention, and which can be treated upon thus observing an indication of 
therapeutic utility, include but are not limited to nervous system injuries, and diseases or 
disorders which result in either a disconnection of axons, a diminution or degeneration of 
neurons, or demyelination. Nervous system lesions which may be treated in a patient (including 
human and non-human mammalian patients) according to the invention include but are not 
limited to the following lesions of either the central (including spinal cord, brain) or peripheral 
nervous systems: 

(i) traumatic lesions, including lesions caused by physical injury or associated with 
surgery, for example, lesions which sever a portion of the nervous system, or compression 
injuries; 

(ii) ischemic lesions, in which a lack of oxygen in a portion of the nervous system 
results in neuronal injury or death, including cerebral infarction or ischemia, or spinal cord 
infarction or ischemia; 

(iii) infectious lesions, in which a portion of the nervous system is destroyed or 
injured as a result of infection, for example, by an abscess or associated with infection by human 
immunodeficiency virus, herpes zoster, or herpes simplex virus or with Lyme disease, 
tuberculosis, or syphilis; 

(iv) degenerative lesions, in which a portion of the nervous system is destroyed or 
injured as a result of a degenerative process including but not limited to degeneration associated 
with Parkinson's disease, Alzheimer's disease, Huntington's chorea, or amyotrophic lateral 
sclerosis; 

(v) lesions associated with nutritional diseases or disorders, in which a portion of the 
nervous system is destroyed or injured by a nutritional disorder or disorder of metabolism 
including but not limited to, vitamin B12 deficiency, folic acid deficiency, Wernicke disease, 
tobacco-alcohol amblyopia, Marchiafava-Bignami disease (primary degeneration of the corpus 
callosum), and alcoholic cerebellar degeneration; 

(vi) neurological lesions associated with systemic diseases including but not limited to 
diabetes (diabetic neuropathy, Bell's palsy), systemic lupus erythematosus, carcinoma, or 
sarcoidosis; 

(vii) lesions caused by toxic substances including alcohol, lead, or particular 
neurotoxins; and 

(viii) demyelinated lesions in which a portion of the nervous system is destroyed or 
injured by a demyelinating disease including but not limited to multiple sclerosis, human 
immunodeficiency virus-associated myelopathy, transverse myelopathy of various etiologies, 
progressive multifocal leukoencephalopathy, and central pontine myelinolysis. 

Therapeutics which are useful according to the invention for treatment of a nervous 
system disorder may be selected by testing for biological activity in promoting the survival or 
differentiation of neurons. For example, and not by way of limitation, therapeutics which elicit 
any of the following effects may be useful according to the invention: 

(i) increased survival time of neurons in culture; 

(ii) increased sprouting of neurons in culture or in vivo; 

(iii) increased production of a neuron-associated molecule in culture or in vivo, e.g., 
choline acetyltransferase or acetylcholinesterase with respect to motor neurons; or 

(iv) decreased symptoms of neuron dysfunction in vivo. 

Such effects may be measured by any method known in the art. In preferred, 
non-limiting embodiments, increased survival of neurons may be measured by the method set 
forth in Arakawa et al. (1990, J. Neurosci. 10:3507-3515); increased sprouting of neurons may 
be detected by methods set forth in Pestronk et al. (1980, Exp. Neurol. 70:65-82) or Brown et al. 
(1981, Ann. Rev. Neurosci. 4:17-42); increased production of neuron-associated molecules may 
be measured by bioassay, enzymatic assay, antibody binding, Northern blot assay, etc., 
depending on the molecule to be measured; and motor neuron dysfunction may be measured by 
assessing the physical manifestation of motor neuron disorder, e.g., weakness, motor neuron 
conduction velocity, or functional disability. 

In specific embodiments, motor neuron disorders that may be treated according to the 
invention include but are not limited to disorders such as infarction, infection, exposure to toxin, 
trauma, surgical damage, degenerative disease or malignancy that may affect motor neurons as 
well as other components of the nervous system, as well as disorders that selectively affect 
neurons such as amyotrophic lateral sclerosis, and including but not limited to progressive spinal 
muscular atrophy, progressive bulbar palsy, primary lateral sclerosis, infantile and juvenile 
muscular atrophy, progressive bulbar paralysis of childhood (Fazio-Londe syndrome), 
poliomyelitis and the post polio syndrome, and Hereditary Motorsensory Neuropathy 
(Charcot-Marie-Tooth Disease). 

4.10.18 OTHER ACTIVITIES 

A polypeptide of the invention may also exhibit one or more of the following additional 
activities or effects: inhibiting the growth, infection or function of, or killing, infectious agents, 
including, without limitation, bacteria, viruses, fungi and other parasites; effecting (suppressing 
or enhancing) bodily characteristics, including, without limitation, height, weight, hair color, eye 
color, skin, fat to lean ratio or other tissue pigmentation, or organ or body part size or shape 
(such as, for example, breast augmentation or diminution, change in bone form or shape); 
effecting biorhythms or circadian cycles or rhythms; effecting the fertility of male or female 
subjects; effecting the metabolism, catabolism, anabolism, processing, utilization, storage or 
elimination of dietary fat, lipid, protein, carbohydrate, vitamins, minerals, co-factors or other 
nutritional factors or component(s); effecting behavioral characteristics, including, without 
limitation, appetite, libido, stress, cognition (including cognitive disorders), depression 
(including depressive disorders) and violent behaviors; providing analgesic effects or other pain 
reducing effects; promoting differentiation and growth of embryonic stem cells in lineages other 
than hematopoietic lineages; hormonal or endocrine activity; in the case of enzymes, correcting 
deficiencies of the enzyme and treating deficiency-related diseases; treatment of 
hyperproliferative disorders (such as, for example, psoriasis); immunoglobulin-like activity (such 
as, for example, the ability to bind antigens or complement); and the ability to act as an antigen 
in a vaccine composition to raise an immune response against such protein or another material or 
entity which is cross-reactive with such protein. 

4.10.19 IDENTIFICATION OF POLYMORPHISMS 

The demonstration of polymorphisms makes possible the identification of such 
polymorphisms in human subjects and the pharmacogenetic use of this information for diagnosis 
and treatment. Such polymorphisms may be associated with, e.g., differential predisposition or 
susceptibility to various disease states (such as disorders involving inflammation or immune 
response) or a differential response to drug administration, and this genetic information can be 
used to tailor preventive or therapeutic treatment appropriately. For example, the existence of a 
polymorphism associated with a predisposition to inflammation or autoimmune disease makes 
possible the diagnosis of this condition in humans by identifying the presence of the 
polymorphism. 

Polymorphisms can be identified in a variety of ways known in the art which all 
generally involve obtaining a sample from a patient, analyzing DNA from the sample, optionally 
involving isolation or amplification of the DNA, and identifying the presence of the 
polymorphism in the DNA. For example, PCR may be used to amplify an appropriate fragment 
of genomic DNA which may then be sequenced. Alternatively, the DNA may be subjected to 
allele-specific oligonucleotide hybridization (in which appropriate oligonucleotides are 
hybridized to the DNA under conditions permitting detection of a single base mismatch) or to a 
single nucleotide extension assay (in which an oligonucleotide that hybridizes immediately 
adjacent to the position of the polymorphism is extended with one or more labeled nucleotides). 
In addition, traditional restriction fragment length polymorphism analysis (using restriction 
enzymes that provide differential digestion of the genomic DNA depending on the presence or 
absence of the polymorphism) may be performed. Arrays with nucleotide sequences of the 
present invention can be used to detect polymorphisms. The array can comprise modified 
nucleotide sequences of the present invention in order to detect the nucleotide sequences of the 
present invention. In the alternative, any one of the nucleotide sequences of the present 
invention can be placed on the array to detect changes from those sequences. 

Alternatively, a polymorphism resulting in a change in the amino acid sequence could 
also be detected by detecting a corresponding change in amino acid sequence of the protein, e.g., 
by an antibody specific to the variant sequence. 
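
Once an amplified fragment has been sequenced, single-base differences from a reference sequence can be listed by a position-by-position comparison. The sketch below is illustrative only: it assumes the two sequences are already aligned and of equal length (no insertions or deletions), and the function name and the example sequences in the usage note are hypothetical rather than sequences of the invention.

```python
def find_substitutions(reference, sample):
    """List single-base differences between two aligned DNA sequences.

    Returns (position, reference_base, sample_base) tuples, with positions
    counted from 1. Insertions and deletions are outside the scope of this
    sketch, so sequences of unequal length are rejected.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned and of equal length")
    differences = []
    for index, (ref_base, sample_base) in enumerate(zip(reference.upper(), sample.upper())):
        if ref_base != sample_base:
            differences.append((index + 1, ref_base, sample_base))
    return differences
```

Comparing the hypothetical reference "ACGTAC" with the sample "ACGTTC" reports one substitution, A to T at position 5.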

4.10.20 ARTHRITIS AND INFLAMMATION 

The immunosuppressive effects of the compositions of the invention against rheumatoid 
arthritis are determined in an experimental animal model system. The experimental model system 
is adjuvant induced arthritis in rats, and the protocol is described by J. Holoshitz, et al., 1983, 
Science, 219:56, or by B. Waksman et al., 1963, Int. Arch. Allergy Appl. Immunol., 23:129. 
Induction of the disease can be caused by a single injection, generally intradermally, of a 
suspension of killed Mycobacterium tuberculosis in complete Freund's adjuvant (CFA). The 
route of injection can vary, but rats may be injected at the base of the tail with an adjuvant 
mixture. The polypeptide is administered in phosphate buffered solution (PBS) at a dose of about 
1-5 mg/kg. The control consists of administering PBS only. 

The procedure for testing the effects of the test compound would consist of intradermally 
injecting killed Mycobacterium tuberculosis in CFA followed by immediately administering the 
test compound and subsequent treatment every other day until day 24. At 14, 15, 18, 20, 22, and 
24 days after injection of Mycobacterium CFA, an overall arthritis score may be obtained as 
described by J. Holoshitz above. An analysis of the data would reveal that the test compound 
would have a dramatic effect on the swelling of the joints as measured by a decrease of the 
arthritis score. 
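
The dosing and scoring timetable in this protocol can be laid out as follows. This is a sketch under stated assumptions: the day of the CFA injection is taken as day 0, "every other day" is read as a two-day interval, and the function names and any score values passed in are hypothetical; no measured data are implied.

```python
SCORING_DAYS = (14, 15, 18, 20, 22, 24)  # days on which the arthritis score is taken


def treatment_days(last_day=24, interval=2):
    """Days on which the test compound (or PBS control) is administered."""
    return list(range(0, last_day + 1, interval))


def mean_scores(scores_by_day):
    """Average the per-animal arthritis scores recorded on each scoring day."""
    return {
        day: sum(scores_by_day[day]) / len(scores_by_day[day])
        for day in SCORING_DAYS
        if scores_by_day.get(day)
    }


def mean_reduction(treated, control):
    """Control mean minus treated mean for every day scored in both groups."""
    treated_means, control_means = mean_scores(treated), mean_scores(control)
    return {
        day: control_means[day] - treated_means[day]
        for day in SCORING_DAYS
        if day in treated_means and day in control_means
    }
```

Under these assumptions the compound is given on 13 occasions (days 0, 2, ..., 24), and a positive value from `mean_reduction` on a given day corresponds to a lower arthritis score in the treated group than in the PBS control.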

4.11 THERAPEUTIC METHODS 

The compositions (including polypeptide fragments, analogs, variants and antibodies or 
other binding partners or modulators including antisense polynucleotides) of the invention have 
numerous applications in a variety of therapeutic methods. Examples of therapeutic applications 
include, but are not limited to, those exemplified herein. 

4.11.1 EXAMPLE 

One embodiment of the invention is the administration of an effective amount of the
polypeptides or other composition of the invention to individuals affected by a disease or
disorder that can be modulated by regulating the peptides of the invention. While the mode of
administration is not particularly important, parenteral administration is preferred. An
exemplary mode of administration is to deliver an intravenous bolus. The dosage of the
polypeptides or other composition of the invention will normally be determined by the
prescribing physician. It is to be expected that the dosage will vary according to the age, weight,
condition and response of the individual patient. Typically, the amount of polypeptide
administered per dose will be in the range of about 0.01 μg/kg to 100 mg/kg of body weight, with
the preferred dose being about 0.1 μg/kg to 10 mg/kg of patient body weight. For parenteral
administration, polypeptides of the invention will be formulated in an injectable form combined
with a pharmaceutically acceptable parenteral vehicle. Such vehicles are well known in the art
and examples include water, saline, Ringer's solution, dextrose solution, and solutions consisting
of small amounts of the human serum albumin. The vehicle may contain minor amounts of
additives that maintain the isotonicity and stability of the polypeptide or other active ingredient.
The preparation of such solutions is within the skill of the art.
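
The per-kilogram ranges stated above can be converted to absolute per-dose amounts by simple scaling with body weight. The following is a minimal arithmetic sketch only, not dosing guidance; the function name, the constant names and the 70 kg example weight are assumptions introduced for illustration and do not appear in the specification.

```python
# Illustrative sketch only; not dosing guidance. The function and constant
# names are hypothetical and do not come from the specification.

UG_PER_MG = 1000.0  # micrograms per milligram

# Per-kilogram ranges quoted in the text, expressed in ug/kg.
BROAD_RANGE_UG_PER_KG = (0.01, 100.0 * UG_PER_MG)      # 0.01 ug/kg to 100 mg/kg
PREFERRED_RANGE_UG_PER_KG = (0.1, 10.0 * UG_PER_MG)    # 0.1 ug/kg to 10 mg/kg


def dose_range_ug(weight_kg, range_ug_per_kg):
    """Scale a (low, high) ug/kg range to an absolute (low, high) range in ug."""
    if weight_kg <= 0:
        raise ValueError("weight_kg must be positive")
    low, high = range_ug_per_kg
    return (low * weight_kg, high * weight_kg)


if __name__ == "__main__":
    # Assumed example: a 70 kg patient and the preferred range.
    low_ug, high_ug = dose_range_ug(70.0, PREFERRED_RANGE_UG_PER_KG)
    print(f"{low_ug:.1f} ug to {high_ug / UG_PER_MG:.1f} mg per dose")
```

Keeping both bounds in a single unit (micrograms) avoids mixing μg and mg when the two ends of a range are compared.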

4.12 PHARMACEUTICAL FORMULATIONS AND ROUTES OF 
ADMINISTRATION 

A protein or other composition of the present invention (from whatever source derived,
including without limitation from recombinant and non-recombinant sources and including
antibodies and other binding partners of the polypeptides of the invention) may be administered
to a patient in need, by itself, or in pharmaceutical compositions where it is mixed with suitable
carriers or excipient(s) at doses to treat or ameliorate a variety of disorders. Such a composition
may optionally contain (in addition to protein or other active ingredient and a carrier) diluents,
fillers, salts, buffers, stabilizers, solubilizers, and other materials well known in the art. The term
"pharmaceutically acceptable" means a non-toxic material that does not interfere with the
effectiveness of the biological activity of the active ingredient(s). The characteristics of the
carrier will depend on the route of administration. The pharmaceutical composition of the
invention may also contain cytokines, lymphokines, or other hematopoietic factors such as
M-CSF, GM-CSF, TNF, IL-1, IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12,
IL-13, IL-14, IL-15, IFN, TNF0, TNF1, TNF2, G-CSF, Meg-CSF, thrombopoietin, stem cell
factor, and erythropoietin. In further compositions, proteins of the invention may be combined
with other agents beneficial to the treatment of the disease or disorder in question. These agents
include various growth factors such as epidermal growth factor (EGF), platelet-derived growth
factor (PDGF), transforming growth factors (TGF-α and TGF-β), insulin-like growth factor
(IGF), as well as cytokines described herein.

The pharmaceutical composition may further contain other agents which either enhance
the activity of the protein or other active ingredient or complement its activity or use in
treatment. Such additional factors and/or agents may be included in the pharmaceutical
composition to produce a synergistic effect with protein or other active ingredient of the
invention, or to minimize side effects. Conversely, protein or other active ingredient of the
present invention may be included in formulations of the particular clotting factor, cytokine,
lymphokine, other hematopoietic factor, thrombolytic or anti-thrombotic factor, or
anti-inflammatory agent to minimize side effects of the clotting factor, cytokine, lymphokine, other
hematopoietic factor, thrombolytic or anti-thrombotic factor, or anti-inflammatory agent (such as
IL-1Ra, IL-1 Hy1, IL-1 Hy2, anti-TNF, corticosteroids, immunosuppressive agents). A protein
of the present invention may be active in multimers (e.g., heterodimers or homodimers) or
complexes with itself or other proteins. As a result, pharmaceutical compositions of the
invention may comprise a protein of the invention in such multimeric or complexed form.

As an alternative to being included in a pharmaceutical composition of the invention
including a first protein, a second protein or a therapeutic agent may be concurrently
administered with the first protein (e.g., at the same time, or at differing times provided that
therapeutic concentrations of the combination of agents are achieved at the treatment site).
Techniques for formulation and administration of the compounds of the instant application may
be found in "Remington's Pharmaceutical Sciences," Mack Publishing Co., Easton, PA,
latest edition. A therapeutically effective dose further refers to that amount of the compound
sufficient to result in amelioration of symptoms, e.g., treatment, healing, prevention or
amelioration of the relevant medical condition, or an increase in rate of treatment, healing,
prevention or amelioration of such conditions. When applied to an individual active ingredient,
administered alone, a therapeutically effective dose refers to that ingredient alone. When applied
to a combination, a therapeutically effective dose refers to combined amounts of the active
ingredients that result in the therapeutic effect, whether administered in combination, serially or
simultaneously.

In practicing the method of treatment or use of the present invention, a therapeutically
effective amount of protein or other active ingredient of the present invention is administered to
a mammal having a condition to be treated. Protein or other active ingredient of the present
invention may be administered in accordance with the method of the invention either alone or in
combination with other therapies such as treatments employing cytokines, lymphokines or other
hematopoietic factors. When co-administered with one or more cytokines, lymphokines or other
hematopoietic factors, protein or other active ingredient of the present invention may be
administered either simultaneously with the cytokine(s), lymphokine(s), other hematopoietic
factor(s), thrombolytic or anti-thrombotic factors, or sequentially. If administered sequentially,
the attending physician will decide on the appropriate sequence of administering protein or other
active ingredient of the present invention in combination with cytokine(s), lymphokine(s), other
hematopoietic factor(s), thrombolytic or anti-thrombotic factors.

4.12.1 ROUTES OF ADMINISTRATION

Suitable routes of administration may, for example, include oral, rectal, transmucosal, or
intestinal administration; parenteral delivery, including intramuscular, subcutaneous,
intramedullary injections, as well as intrathecal, direct intraventricular, intravenous,
intraperitoneal, intranasal, or intraocular injections. Administration of protein or other active
ingredient of the present invention used in the pharmaceutical composition or to practice the
method of the present invention can be carried out in a variety of conventional ways, such as oral
ingestion, inhalation, topical application or cutaneous, subcutaneous, intraperitoneal, parenteral
or intravenous injection. Intravenous administration to the patient is preferred.

Alternately, one may administer the compound in a local rather than systemic manner, for
example, via injection of the compound directly into arthritic joints or fibrotic tissue, often
in a depot or sustained release formulation. In order to prevent the scarring process frequently
occurring as a complication of glaucoma surgery, the compounds may be administered topically,
for example, as eye drops. Furthermore, one may administer the drug in a targeted drug delivery
system, for example, in a liposome coated with a specific antibody, targeting, for example,
arthritic or fibrotic tissue. The liposomes will be targeted to and taken up selectively by the
afflicted tissue.

The polypeptides of the invention are administered by any route that delivers an effective
dosage to the desired site of action. The determination of a suitable route of administration and
an effective dosage for a particular indication is within the level of skill in the art. Preferably for
wound treatment, one administers the therapeutic compound directly to the site. Suitable dosage
ranges for the polypeptides of the invention can be extrapolated from these dosages or from
similar studies in appropriate animal models. Dosages can then be adjusted as necessary by the
clinician to provide maximal therapeutic benefit.

4.12.2 COMPOSITIONS/FORMULATIONS 

Pharmaceutical compositions for use in accordance with the present invention thus may
be formulated in a conventional manner using one or more physiologically acceptable carriers
comprising excipients and auxiliaries which facilitate processing of the active compounds into
preparations which can be used pharmaceutically. These pharmaceutical compositions may be
manufactured in a manner that is itself known, e.g., by means of conventional mixing,
dissolving, granulating, dragee-making, levigating, emulsifying, encapsulating, entrapping or
lyophilizing processes. Proper formulation is dependent upon the route of administration
chosen. When a therapeutically effective amount of protein or other active ingredient of the
present invention is administered orally, protein or other active ingredient of the present
invention will be in the form of a tablet, capsule, powder, solution or elixir. When administered
in tablet form, the pharmaceutical composition of the invention may additionally contain a solid
carrier such as a gelatin or an adjuvant. The tablet, capsule, and powder contain from about 5 to
95% protein or other active ingredient of the present invention, and preferably from about 25 to
90% protein or other active ingredient of the present invention. When administered in liquid
form, a liquid carrier such as water, petroleum, oils of animal or plant origin such as peanut oil,
mineral oil, soybean oil, or sesame oil, or synthetic oils may be added. The liquid form of the
pharmaceutical composition may further contain physiological saline solution, dextrose or other
saccharide solution, or glycols such as ethylene glycol, propylene glycol or polyethylene glycol.
When administered in liquid form, the pharmaceutical composition contains from about 0.5 to
90% by weight of protein or other active ingredient of the present invention, and preferably from
about 1 to 50% protein or other active ingredient of the present invention.

When a therapeutically effective amount of protein or other active ingredient of the
present invention is administered by intravenous, cutaneous or subcutaneous injection, protein or
other active ingredient of the present invention will be in the form of a pyrogen-free, parenterally
acceptable aqueous solution. The preparation of such parenterally acceptable protein or other
active ingredient solutions, having due regard to pH, isotonicity, stability, and the like, is within
the skill in the art. A preferred pharmaceutical composition for intravenous, cutaneous, or
subcutaneous injection should contain, in addition to protein or other active ingredient of the
present invention, an isotonic vehicle such as Sodium Chloride Injection, Ringer's Injection,
Dextrose Injection, Dextrose and Sodium Chloride Injection, Lactated Ringer's Injection, or
other vehicle as known in the art. The pharmaceutical composition of the present invention may
also contain stabilizers, preservatives, buffers, antioxidants, or other additives known to those of
skill in the art. For injection, the agents of the invention may be formulated in aqueous
solutions, preferably in physiologically compatible buffers such as Hanks's solution, Ringer's
solution, or physiological saline buffer. For transmucosal administration, penetrants appropriate
to the barrier to be permeated are used in the formulation. Such penetrants are generally known
in the art.

For oral administration, the compounds can be formulated readily by combining the
active compounds with pharmaceutically acceptable carriers well known in the art. Such carriers
enable the compounds of the invention to be formulated as tablets, pills, dragees, capsules,
liquids, gels, syrups, slurries, suspensions and the like, for oral ingestion by a patient to be
treated. Pharmaceutical preparations for oral use can be obtained by combining the active
compounds with a solid excipient, optionally grinding the resulting mixture, and processing the
mixture of granules, after adding suitable auxiliaries, if desired, to obtain tablets or dragee
cores. Suitable excipients are, in
particular, fillers such as sugars, including lactose, sucrose, mannitol, or sorbitol; cellulose
preparations such as, for example, maize starch, wheat starch, rice starch, potato starch, gelatin,
gum tragacanth, methyl cellulose, hydroxypropylmethyl-cellulose, sodium
carboxymethylcellulose, and/or polyvinylpyrrolidone (PVP). If desired, disintegrating agents
may be added, such as the cross-linked polyvinyl pyrrolidone, agar, or alginic acid or a salt
thereof such as sodium alginate. Dragee cores are provided with suitable coatings. For this
purpose, concentrated sugar solutions may be used, which may optionally contain gum arabic,
talc, polyvinyl pyrrolidone, carbopol gel, polyethylene glycol, and/or titanium dioxide, lacquer
solutions, and suitable organic solvents or solvent mixtures. Dyestuffs or pigments may be
added to the tablets or dragee coatings for identification or to characterize different combinations
of active compound doses.

Pharmaceutical preparations which can be used orally include push-fit capsules made of
gelatin, as well as soft, sealed capsules made of gelatin and a plasticizer, such as glycerol or
sorbitol. The push-fit capsules can contain the active ingredients in admixture with filler such as
lactose, binders such as starches, and/or lubricants such as talc or magnesium stearate and,
optionally, stabilizers. In soft capsules, the active compounds may be dissolved or suspended in
suitable liquids, such as fatty oils, liquid paraffin, or liquid polyethylene glycols. In addition,
stabilizers may be added. All formulations for oral administration should be in dosages suitable
for such administration. For buccal administration, the compositions may take the form of
tablets or lozenges formulated in conventional manner.

For administration by inhalation, the compounds for use according to the present
invention are conveniently delivered in the form of an aerosol spray presentation from
pressurized packs or a nebuliser, with the use of a suitable propellant, e.g.,
dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon dioxide or
other suitable gas. In the case of a pressurized aerosol the dosage unit may be determined by
providing a valve to deliver a metered amount. Capsules and cartridges of, e.g., gelatin for use
in an inhaler or insufflator may be formulated containing a powder mix of the compound and a
suitable powder base such as lactose or starch. The compounds may be formulated for parenteral
administration by injection, e.g., by bolus injection or continuous infusion. Formulations for
injection may be presented in unit dosage form, e.g., in ampules or in multi-dose containers, with
an added preservative. The compositions may take such forms as suspensions, solutions or
emulsions in oily or aqueous vehicles, and may contain formulatory agents such as suspending,
stabilizing and/or dispersing agents.

Pharmaceutical formulations for parenteral administration include aqueous solutions of
the active compounds in water-soluble form. Additionally, suspensions of the active compounds
may be prepared as appropriate oily injection suspensions. Suitable lipophilic solvents or
vehicles include fatty oils such as sesame oil, or synthetic fatty acid esters, such as ethyl oleate or
triglycerides, or liposomes. Aqueous injection suspensions may contain substances which
increase the viscosity of the suspension, such as sodium carboxymethyl cellulose, sorbitol, or
dextran. Optionally, the suspension may also contain suitable stabilizers or agents which
increase the solubility of the compounds to allow for the preparation of highly concentrated
solutions. Alternatively, the active ingredient may be in powder form for constitution with a
suitable vehicle, e.g., sterile pyrogen-free water, before use.

The compounds may also be formulated in rectal compositions such as suppositories or
retention enemas, e.g., containing conventional suppository bases such as cocoa butter or other
glycerides. In addition to the formulations described previously, the compounds may also be
formulated as a depot preparation. Such long acting formulations may be administered by
implantation (for example subcutaneously or intramuscularly) or by intramuscular injection.
Thus, for example, the compounds may be formulated with suitable polymeric or hydrophobic
materials (for example as an emulsion in an acceptable oil) or ion exchange resins, or as
sparingly soluble derivatives, for example, as a sparingly soluble salt.

A pharmaceutical carrier for the hydrophobic compounds of the invention is a co-solvent
system comprising benzyl alcohol, a nonpolar surfactant, a water-miscible organic polymer, and
an aqueous phase. The co-solvent system may be the VPD co-solvent system. VPD is a solution
of 3% w/v benzyl alcohol, 8% w/v of the nonpolar surfactant polysorbate 80, and 65% w/v
polyethylene glycol 300, made up to volume in absolute ethanol. The VPD co-solvent system
(VPD:5W) consists of VPD diluted 1:1 with a 5% dextrose in water solution. This co-solvent
system dissolves hydrophobic compounds well, and itself produces low toxicity upon systemic
administration. Naturally, the proportions of a co-solvent system may be varied considerably
without destroying its solubility and toxicity characteristics. Furthermore, the identity of the
co-solvent components may be varied: for example, other low-toxicity nonpolar surfactants may
be used instead of polysorbate 80; the fraction size of polyethylene glycol may be varied; other
biocompatible polymers may replace polyethylene glycol, e.g. polyvinyl pyrrolidone; and other
sugars or polysaccharides may substitute for dextrose. Alternatively, other delivery systems for
hydrophobic pharmaceutical compounds may be employed. Liposomes and emulsions are well
known examples of delivery vehicles or carriers for hydrophobic drugs. Certain organic solvents
such as dimethylsulfoxide also may be employed, although usually at the cost of greater toxicity.
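
The 1:1 dilution that yields VPD:5W halves the concentration of every component contributed by either liquid. The sketch below works through that arithmetic under the simplifying assumption that the two volumes are additive on mixing, which the specification does not state; the function and variable names are hypothetical.

```python
# Illustrative sketch: component concentrations after diluting VPD 1:1 with
# 5% dextrose in water. Assumes volumes are additive on mixing (a
# simplification not stated in the source). All names are hypothetical.

VPD_PERCENT_W_V = {
    "benzyl alcohol": 3.0,
    "polysorbate 80": 8.0,
    "polyethylene glycol 300": 65.0,
}
D5W_PERCENT_W_V = {"dextrose": 5.0}


def mix_percent_w_v(parts):
    """Combine liquids given as (volume, {component: % w/v}) pairs.

    Returns the % w/v of each component in the mixture, assuming the total
    volume equals the sum of the part volumes.
    """
    total_volume = sum(volume for volume, _ in parts)
    if total_volume <= 0:
        raise ValueError("total volume must be positive")
    mixture = {}
    for volume, composition in parts:
        for name, percent in composition.items():
            mixture[name] = mixture.get(name, 0.0) + percent * volume / total_volume
    return mixture


if __name__ == "__main__":
    vpd_5w = mix_percent_w_v([(1.0, VPD_PERCENT_W_V), (1.0, D5W_PERCENT_W_V)])
    for name, percent in vpd_5w.items():
        print(f"{name}: {percent:.2f}% w/v")
```

Because the helper takes arbitrary volumes, the same calculation covers the varied proportions that the text says a co-solvent system may tolerate.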

Additionally, the compounds may be delivered using a sustained-release system, such as
semipermeable matrices of solid hydrophobic polymers containing the therapeutic agent.
Various types of sustained-release materials have been established and are well known by those
skilled in the art. Sustained-release capsules may, depending on their chemical nature, release the
compounds for a few weeks up to over 100 days. Depending on the chemical nature and the
biological stability of the therapeutic reagent, additional strategies for protein or other active
ingredient stabilization may be employed.

The pharmaceutical compositions also may comprise suitable solid or gel phase carriers
or excipients. Examples of such carriers or excipients include but are not limited to calcium
carbonate, calcium phosphate, various sugars, starches, cellulose derivatives, gelatin, and
polymers such as polyethylene glycols. Many of the active ingredients of the invention may be
provided as salts with pharmaceutically compatible counter ions. Such pharmaceutically
acceptable base addition salts are those salts which retain the biological effectiveness and
properties of the free acids and which are obtained by reaction with inorganic or organic bases
such as sodium hydroxide, magnesium hydroxide, ammonia, trialkylamine, dialkylamine,
monoalkylamine, dibasic amino acids, sodium acetate, potassium benzoate, triethanol amine and
the like.

The pharmaceutical composition of the invention may be in the form of a complex of the
protein(s) or other active ingredient(s) of the present invention along with protein or peptide
antigens. The protein and/or peptide antigen will deliver a stimulatory signal to both B and T
lymphocytes. B lymphocytes will respond to antigen through their surface immunoglobulin
receptor. T lymphocytes will respond to antigen through the T cell receptor (TCR) following
presentation of the antigen by MHC proteins. MHC and structurally related proteins including
those encoded by class I and class II MHC genes on host cells will serve to present the peptide
antigen(s) to T lymphocytes. The antigen components could also be supplied as purified
MHC-peptide complexes alone or with co-stimulatory molecules that can directly signal T cells.
Alternatively, antibodies able to bind surface immunoglobulin and other molecules on B cells as
well as antibodies able to bind the TCR and other molecules on T cells can be combined with the
pharmaceutical composition of the invention.

The pharmaceutical composition of the invention may be in the form of a liposome in
which protein of the present invention is combined, in addition to other pharmaceutically
acceptable carriers, with amphipathic agents such as lipids which exist in aggregated form as
micelles, insoluble monolayers, liquid crystals, or lamellar layers in aqueous solution. Suitable
lipids for liposomal formulation include, without limitation, monoglycerides, diglycerides,
sulfatides, lysolecithins, phospholipids, saponin, bile acids, and the like. Preparation of such
liposomal formulations is within the level of skill in the art, as disclosed, for example, in U.S.
Patent Nos. 4,235,871; 4,501,728; 4,837,028; and 4,737,323, all of which are incorporated
herein by reference.

The amount of protein or other active ingredient of the present invention in the
pharmaceutical composition of the present invention will depend upon the nature and severity of
the condition being treated, and on the nature of prior treatments which the patient has
undergone. Ultimately, the attending physician will decide the amount of protein or other active
ingredient of the present invention with which to treat each individual patient. Initially, the
attending physician will administer low doses of protein or other active ingredient of the present
invention and observe the patient's response. Larger doses of protein or other active ingredient
of the present invention may be administered until the optimal therapeutic effect is obtained for
the patient, and at that point the dosage is not increased further. It is contemplated that the
various pharmaceutical compositions used to practice the method of the present invention should
contain about 0.01 μg to about 100 mg (preferably about 0.1 μg to about 10 mg, more preferably
about 0.1 μg to about 1 mg) of protein or other active ingredient of the present invention per kg
body weight. For compositions of the present invention which are useful for bone, cartilage,
tendon or ligament regeneration, the therapeutic method includes administering the composition
topically, systemically, or locally as an implant or device. When administered, the therapeutic
composition for use in this invention is, of course, in a pyrogen-free, physiologically acceptable
form. Further, the composition may desirably be encapsulated or injected in a viscous form for
delivery to the site of bone, cartilage or tissue damage. Topical administration may be suitable
for wound healing and tissue repair. Therapeutically useful agents other than a protein or other
active ingredient of the invention, which may also optionally be included in the composition as
described above, may alternatively or additionally be administered simultaneously or
sequentially with the composition in the methods of the invention. Preferably for bone and/or
cartilage formation, the composition would include a matrix capable of delivering the
protein-containing or other active ingredient-containing composition to the site of bone and/or
cartilage damage, providing a structure for the developing bone and cartilage and optimally
capable of being resorbed into the body. Such matrices may be formed of materials presently in
use for other implanted medical applications.

The choice of matrix material is based on biocompatibility, biodegradability, mechanical
properties, cosmetic appearance and interface properties. The particular application of the
compositions will define the appropriate formulation. Potential matrices for the compositions
may be biodegradable and chemically defined calcium sulfate, tricalcium phosphate,
hydroxyapatite, polylactic acid, polyglycolic acid and polyanhydrides. Other potential materials
are biodegradable and biologically well-defined, such as bone or dermal collagen. Further
matrices are comprised of pure proteins or extracellular matrix components. Other potential
matrices are nonbiodegradable and chemically defined, such as sintered hydroxyapatite, bioglass,
aluminates, or other ceramics. Matrices may be comprised of combinations of any of the above
mentioned types of material, such as polylactic acid and hydroxyapatite or collagen and
tricalcium phosphate. The bioceramics may be altered in composition, such as in
calcium-aluminate-phosphate and processing to alter pore size, particle size, particle shape, and
biodegradability. Presently preferred is a 50:50 (mole weight) copolymer of lactic acid and
glycolic acid in the form of porous particles having diameters ranging from 150 to 800 microns.
In some applications, it will be useful to utilize a sequestering agent, such as carboxymethyl
cellulose or autologous blood clot, to prevent the protein compositions from disassociating from
the matrix.

A preferred family of sequestering agents is cellulosic materials such as alkylcelluloses
(including hydroxyalkylcelluloses), including methylcellulose, ethylcellulose,
hydroxyethylcellulose, hydroxypropylcellulose, hydroxypropyl-methylcellulose, and
carboxymethylcellulose, the most preferred being cationic salts of carboxymethylcellulose
(CMC). Other preferred sequestering agents include hyaluronic acid, sodium alginate,
poly(ethylene glycol), polyoxyethylene oxide, carboxyvinyl polymer and poly(vinyl alcohol).
The amount of sequestering agent useful herein is 0.5-20 wt %, preferably 1-10 wt % based on
total formulation weight, which represents the amount necessary to prevent desorption of the
protein from the polymer matrix and to provide appropriate handling of the composition, yet not
so much that the progenitor cells are prevented from infiltrating the matrix, thereby providing the
protein the opportunity to assist the osteogenic activity of the progenitor cells. In further
compositions, proteins or other active ingredients of the invention may be combined with other
agents beneficial to the treatment of the bone and/or cartilage defect, wound, or tissue in
question. These agents include various growth factors such as epidermal growth factor (EGF),
platelet derived growth factor (PDGF), transforming growth factors (TGF-α and TGF-β), and
insulin-like growth factor (IGF).
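
Because the sequestering agent range above is expressed as a percentage of total formulation weight, the agent's own mass counts toward the denominator. The following minimal sketch shows that arithmetic; the function name and the 90 g example are assumptions made for illustration and are not taken from the specification.

```python
# Illustrative sketch: mass of sequestering agent needed to reach a target
# wt % of the TOTAL formulation weight. Names and example values are
# hypothetical and are not taken from the specification.

USEFUL_RANGE_WT_PERCENT = (0.5, 20.0)
PREFERRED_RANGE_WT_PERCENT = (1.0, 10.0)


def sequestering_agent_mass(other_components_mass, target_wt_percent):
    """Return the agent mass m such that m / (m + other) equals the target.

    Solving m / (m + other) = f for m gives m = f * other / (1 - f).
    """
    if other_components_mass <= 0:
        raise ValueError("other_components_mass must be positive")
    if not 0.0 < target_wt_percent < 100.0:
        raise ValueError("target_wt_percent must be between 0 and 100")
    fraction = target_wt_percent / 100.0
    return fraction * other_components_mass / (1.0 - fraction)


def in_range(wt_percent, bounds):
    """True if wt_percent lies within the inclusive (low, high) bounds."""
    low, high = bounds
    return low <= wt_percent <= high


if __name__ == "__main__":
    # Assumed example: 90 g of matrix plus protein, targeting 10 wt % agent.
    mass = sequestering_agent_mass(90.0, 10.0)
    print(f"agent mass: {mass:.2f} g")
```

Simply taking 10% of the other components' mass would give a lower figure, since the resulting agent fraction of the total would then be about 9.1 wt % rather than 10 wt %.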

The therapeutic compositions are also presently valuable for veterinary applications.
Particularly domestic animals and thoroughbred horses, in addition to humans, are desired
patients for such treatment with proteins or other active ingredients of the present invention. The
dosage regimen of a protein-containing pharmaceutical composition to be used in tissue
regeneration will be determined by the attending physician considering various factors which
modify the action of the proteins, e.g., amount of tissue weight desired to be formed, the site of
damage, the condition of the damaged tissue, the size of a wound, type of damaged tissue (e.g.,
bone), the patient's age, sex, and diet, the severity of any infection, time of administration and
other clinical factors. The dosage may vary with the type of matrix used in the reconstitution
and with inclusion of other proteins in the pharmaceutical composition. For example, the
addition of other known growth factors, such as IGF I (insulin like growth factor I), to the final
composition, may also affect the dosage. Progress can be monitored by periodic assessment of
tissue/bone growth and/or repair, for example, X-rays, histomorphometric determinations and
tetracycline labeling.

Polynucleotides of the present invention can also be used for gene therapy. Such 
polynucleotides can be introduced either in vivo or ex vivo into cells for expression in a 
mammalian subject. Polynucleotides of the invention may also be administered by other known 
methods for introduction of nucleic acid into a cell or organism (including, without limitation, in
the form of viral vectors or naked DNA). Cells may also be cultured ex vivo in the presence of 
proteins of the present invention in order to proliferate or to produce a desired effect on or 
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes. 

4.12.3 EFFECTIVE DOSAGE

Pharmaceutical compositions suitable for use in the present invention include
compositions wherein the active ingredients are contained in an effective amount to achieve their
intended purpose. More specifically, a therapeutically effective amount means an amount
effective to prevent development of or to alleviate the existing symptoms of the subject being
treated. Determination of the effective amount is well within the capability of those skilled in
the art, especially in light of the detailed disclosure provided herein. For any compound used in
the method of the invention, the therapeutically effective dose can be estimated initially from
appropriate in vitro assays. For example, a dose can be formulated in animal models to achieve a
circulating concentration range that can be used to more accurately determine useful doses in
humans. For example, a dose can be formulated in animal models to achieve a circulating
concentration range that includes the IC50 as determined in cell culture (i.e., the concentration of
the test compound which achieves a half-maximal inhibition of the protein's biological activity).
Such information can be used to more accurately determine useful doses in humans.

A therapeutically effective dose refers to that amount of the compound that results in
amelioration of symptoms or a prolongation of survival in a patient. Toxicity and therapeutic
efficacy of such compounds can be determined by standard pharmaceutical procedures in cell
cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the
population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose
ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the
ratio between LD50 and ED50. Compounds which exhibit high therapeutic indices are preferred.
The data obtained from these cell culture assays and animal studies can be used in formulating a
range of dosage for use in humans. The dosage of such compounds lies preferably within a range
of circulating concentrations that include the ED50 with little or no toxicity. The dosage may
vary within this range depending upon the dosage form employed and the route of administration
utilized. The exact formulation, route of administration and dosage can be chosen by the
individual physician in view of the patient's condition. See, e.g., Fingl et al., 1975, in "The
Pharmacological Basis of Therapeutics", Ch. 1 p.1. Dosage amount and interval may be adjusted
individually to provide plasma levels of the active moiety which are sufficient to maintain the
desired effects, or minimal effective concentration (MEC). The MEC will vary for each
compound but can be estimated from in vitro data. Dosages necessary to achieve the MEC will
depend on individual characteristics and route of administration. However, HPLC assays or
bioassays can be used to determine plasma concentrations.
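
The therapeutic index defined above is a simple ratio, and a minimal sketch of the arithmetic may help. The function name and the numeric values below are hypothetical and serve only as an illustration; they are not data from this specification.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the dose ratio LD50 / ED50 (both doses in the same units)."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical doses in mg/kg, chosen only to illustrate the calculation.
example_index = therapeutic_index(ld50=500.0, ed50=5.0)
print(f"therapeutic index = {example_index:.0f}")
```

A larger ratio means the lethal dose lies further above the effective dose, which is why compounds with high indices are preferred.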

Dosage intervals can also be determined using the MEC value. Compounds should be
administered using a regimen which maintains plasma levels above the MEC for 10-90% of the
time, preferably between 30-90% and most preferably between 50-90%. In cases of local
administration or selective uptake, the effective local concentration of the drug may not be
related to plasma concentration.
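
As a rough sketch of how the fraction of time above the MEC might be estimated from sampled plasma concentrations, the following assumes that concentration changes linearly between consecutive samples. The function name, the sampling times and the concentration values are hypothetical.

```python
def fraction_time_above_mec(times, concs, mec):
    """Fraction of the sampled interval during which concentration exceeds mec.

    Assumes the concentration changes linearly between consecutive samples.
    """
    if len(times) != len(concs) or len(times) < 2:
        raise ValueError("need at least two (time, concentration) samples")
    total = times[-1] - times[0]
    above = 0.0
    for i in range(len(times) - 1):
        t0, t1 = times[i], times[i + 1]
        c0, c1 = concs[i], concs[i + 1]
        if c0 > mec and c1 > mec:
            above += t1 - t0              # whole segment is above the MEC
        elif c0 <= mec and c1 <= mec:
            continue                      # whole segment is at or below the MEC
        else:
            # The segment crosses the MEC once; locate the crossing time.
            t_cross = t0 + (t1 - t0) * (mec - c0) / (c1 - c0)
            above += (t_cross - t0) if c0 > mec else (t1 - t_cross)
    return above / total


# Hypothetical samples: hours after dosing and plasma concentration.
hours = [0, 2, 4, 6, 8]
plasma = [0.0, 10.0, 6.0, 2.0, 0.0]
print(fraction_time_above_mec(hours, plasma, mec=4.0))
```

The returned fraction can then be compared with the 10-90%, 30-90% and 50-90% windows stated above.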

An exemplary dosage regimen for polypeptides or other compositions of the invention
will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight daily, with the preferred
dose being about 0.1 µg/kg to 25 mg/kg of patient body weight daily, varying in adults and
children. Dosing may be once daily, or equivalent doses may be delivered at longer or shorter
intervals.
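
The per-kilogram ranges above translate into absolute daily amounts by multiplying by body weight. The sketch below shows only that unit conversion; the constant names, the function name and the 70 kg body weight are assumptions made for illustration.

```python
UG_PER_MG = 1000.0

# Ranges from the text, expressed in micrograms per kg of body weight per day.
EXEMPLARY_RANGE_UG_PER_KG = (0.01, 100.0 * UG_PER_MG)   # 0.01 ug/kg to 100 mg/kg
PREFERRED_RANGE_UG_PER_KG = (0.1, 25.0 * UG_PER_MG)     # 0.1 ug/kg to 25 mg/kg


def daily_dose_range_mg(weight_kg, range_ug_per_kg):
    """Return (low, high) total daily dose in mg for the given body weight."""
    if weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low, high = range_ug_per_kg
    return (low * weight_kg / UG_PER_MG, high * weight_kg / UG_PER_MG)


# Hypothetical 70 kg adult.
print(daily_dose_range_mg(70.0, EXEMPLARY_RANGE_UG_PER_KG))
print(daily_dose_range_mg(70.0, PREFERRED_RANGE_UG_PER_KG))
```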

The amount of composition administered will, of course, be dependent on the subject 
being treated, on the subject's age and weight, the severity of the affliction, the manner of 
administration and the judgment of the prescribing physician.

4.12.4 PACKAGING 

The compositions may, if desired, be presented in a pack or dispenser device which may 
contain one or more unit dosage forms containing the active ingredient. The pack may, for
example, comprise metal or plastic foil, such as a blister pack. The pack or dispenser device
may be accompanied by instructions for administration. Compositions comprising a compound
of the invention formulated in a compatible pharmaceutical carrier may also be prepared, placed 
in an appropriate container, and labeled for treatment of an indicated condition.

4.13 ANTIBODIES

Also included in the invention are antibodies to proteins, or fragments of proteins of the
invention. The term "antibody" as used herein refers to immunoglobulin molecules and
immunologically active portions of immunoglobulin (Ig) molecules, i.e., molecules that contain
an antigen binding site that specifically binds (immunoreacts with) an antigen. Such antibodies
include, but are not limited to, polyclonal, monoclonal, chimeric, single chain, Fab, Fab' and F(ab')2
fragments, and an Fab expression library. In general, an antibody molecule obtained from
humans relates to any of the classes IgG, IgM, IgA, IgE and IgD, which differ from one another
by the nature of the heavy chain present in the molecule. Certain classes have subclasses as well,
such as IgG1, IgG2, and others. Furthermore, in humans, the light chain may be a kappa chain or
a lambda chain. Reference herein to antibodies includes a reference to all such classes,
subclasses and types of human antibody species.

An isolated related protein of the invention may be intended to serve as an antigen, or a
portion or fragment thereof, and additionally can be used as an immunogen to generate
antibodies that immunospecifically bind the antigen, using standard techniques for polyclonal
and monoclonal antibody preparation. The full-length protein can be used or, alternatively, the
invention provides antigenic peptide fragments of the antigen for use as immunogens. An
antigenic peptide fragment comprises at least 6 amino acid residues of the amino acid sequence
of the full length protein, such as an amino acid sequence shown in SEQ ID NO: 2 or 3, and
encompasses an epitope thereof such that an antibody raised against the peptide forms a specific
immune complex with the full length protein or with any fragment that contains the epitope.
Preferably, the antigenic peptide comprises at least 10 amino acid residues, or at least 15 amino
acid residues, or at least 20 amino acid residues, or at least 30 amino acid residues. Preferred
epitopes encompassed by the antigenic peptide are regions of the protein that are located on its
surface; commonly these are hydrophilic regions.

In certain embodiments of the invention, at least one epitope encompassed by the
antigenic peptide is a region of a related protein that is located on the surface of the protein, e.g., a
hydrophilic region. A hydrophobicity analysis of the human related protein sequence will
indicate which regions of a related protein are particularly hydrophilic and, therefore, are likely
to encode surface residues useful for targeting antibody production. As a means for targeting
antibody production, hydropathy plots showing regions of hydrophilicity and hydrophobicity
may be generated by any method well known in the art, including, for example, the Kyte
Doolittle or the Hopp Woods methods, either with or without Fourier transformation. See, e.g.,
Hopp and Woods, 1981, Proc. Nat. Acad. Sci. USA 78: 3824-3828; Kyte and Doolittle 1982, J.
Mol. Biol. 157: 105-142, each of which is incorporated herein by reference in its entirety.

Antibodies that are specific for one or more domains within an antigenic protein, or derivatives,
fragments, analogs or homologs thereof, are also provided herein.
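
A hydropathy analysis of the kind mentioned above can be sketched as a sliding-window average over the Kyte-Doolittle scale, where strongly negative windows are hydrophilic and therefore candidate surface regions. The window length of 7 residues, the function names and the toy sequence are assumptions made for illustration; the sequence is not SEQ ID NO: 2 or 3, and no Fourier transformation is applied.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic, negative = hydrophilic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=7):
    """Mean Kyte-Doolittle score for every window of the given length."""
    sequence = sequence.upper()
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(sequence) - window + 1)
    ]


def most_hydrophilic_window(sequence, window=7):
    """Return (start index, peptide, mean score) of the lowest-scoring window."""
    sequence = sequence.upper()
    profile = hydropathy_profile(sequence, window)
    start = min(range(len(profile)), key=profile.__getitem__)
    return start, sequence[start:start + window], profile[start]


# Toy sequence: a lysine-rich stretch flanked by leucine-rich stretches.
print(most_hydrophilic_window("LLLLLLLKKKKKKKLLLLLLL", window=7))
```

Plotting the profile against residue position gives the hydropathy plot. Substituting the Hopp-Woods hydrophilicity values for the table above would give the other method named in the text.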

A protein of the invention, or a derivative, fragment, analog, homolog or ortholog 
thereof, may be utilized as an immunogen in the generation of antibodies that 
immunospecifically bind these protein components.

Various procedures known within the art may be used for the production of polyclonal or 
monoclonal antibodies directed against a protein of the invention, or against derivatives, 
fragments, analogs, homologs or orthologs thereof (see, for example, Antibodies: A
Laboratory Manual, Harlow E, and Lane D, 1988, Cold Spring Harbor Laboratory Press,
Cold Spring Harbor, NY, incorporated herein by reference). Some of these antibodies are
discussed below. 



5.13.1 Polyclonal Antibodies 

For the production of polyclonal antibodies, various suitable host animals (e.g., rabbit,
goat, mouse or other mammal) may be immunized by one or more injections with the native
protein, a synthetic variant thereof, or a derivative of the foregoing. An appropriate
immunogenic preparation can contain, for example, the naturally occurring immunogenic
protein, a chemically synthesized polypeptide representing the immunogenic protein, or a
recombinantly expressed immunogenic protein. Furthermore, the protein may be conjugated to
a second protein known to be immunogenic in the mammal being immunized. Examples of such
immunogenic proteins include but are not limited to keyhole limpet hemocyanin, serum albumin,
bovine thyroglobulin, and soybean trypsin inhibitor. The preparation can further include an
adjuvant. Various adjuvants used to increase the immunological response include, but are not
limited to, Freund's (complete and incomplete), mineral gels (e.g., aluminum hydroxide), surface
active substances (e.g., lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions,
dinitrophenol, etc.), adjuvants usable in humans such as Bacille Calmette-Guerin and
Corynebacterium parvum, or similar immunostimulatory agents. Additional examples of
adjuvants which can be employed include MPL-TDM adjuvant (monophosphoryl Lipid A,
synthetic trehalose dicorynomycolate).

The polyclonal antibody molecules directed against the immunogenic protein can be
isolated from the mammal (e.g., from the blood) and further purified by well known techniques,
such as affinity chromatography using protein A or protein G, which provide primarily the IgG
fraction of immune serum. Subsequently, or alternatively, the specific antigen which is the
target of the immunoglobulin sought, or an epitope thereof, may be immobilized on a column to
purify the immune specific antibody by immunoaffinity chromatography. Purification of
immunoglobulins is discussed, for example, by D. Wilkinson (The Scientist, published by The
Scientist, Inc., Philadelphia PA, Vol. 14, No. 8 (April 17, 2000), pp. 25-28).

5.13.2 Monoclonal Antibodies

The term "monoclonal antibody" (MAb) or "monoclonal antibody composition", as used
herein, refers to a population of antibody molecules that contain only one molecular species of
antibody molecule consisting of a unique light chain gene product and a unique heavy chain
gene product. In particular, the complementarity determining regions (CDRs) of the monoclonal
antibody are identical in all the molecules of the population. MAbs thus contain an antigen
binding site capable of immunoreacting with a particular epitope of the antigen characterized by
a unique binding affinity for it.

Monoclonal antibodies can be prepared using hybridoma methods, such as those
described by Kohler and Milstein, Nature, 256: 495 (1975). In a hybridoma method, a mouse,
hamster, or other appropriate host animal, is typically immunized with an immunizing agent to
elicit lymphocytes that produce or are capable of producing antibodies that will specifically bind
to the immunizing agent. Alternatively, the lymphocytes can be immunized in vitro.

The immunizing agent will typically include the protein antigen, a fragment thereof or a fusion
protein thereof. Generally, either peripheral blood lymphocytes are used if cells of human origin
are desired, or spleen cells or lymph node cells are used if non-human mammalian sources are
desired. The lymphocytes are then fused with an immortalized cell line using a suitable fusing
agent, such as polyethylene glycol, to form a hybridoma cell (Goding, (1986) Monoclonal
Antibodies: Principles and Practice, Academic Press, pp. 59-103). Immortalized cell lines
are usually transformed mammalian cells, particularly myeloma cells of rodent, bovine and
human origin. Usually, rat or mouse myeloma cell lines are employed. The hybridoma cells can
be cultured in a suitable culture medium that preferably contains one or more substances that
inhibit the growth or survival of the unfused, immortalized cells. For example, if the parental
cells lack the enzyme hypoxanthine guanine phosphoribosyl transferase (HGPRT or HPRT), the
culture medium for the hybridomas typically will include hypoxanthine, aminopterin, and
thymidine ("HAT medium"), which substances prevent the growth of HGPRT-deficient cells.

Preferred immortalized cell lines are those that fuse efficiently, support stable high level
expression of antibody by the selected antibody-producing cells, and are sensitive to a medium
such as HAT medium. More preferred immortalized cell lines are murine myeloma lines, which
can be obtained, for instance, from the Salk Institute Cell Distribution Center, San Diego,
California and the American Type Culture Collection, Manassas, Virginia. Human myeloma and
mouse-human heteromyeloma cell lines also have been described for the production of human
monoclonal antibodies (Kozbor, J. Immunol., 133: 3001 (1984); Brodeur et al., Monoclonal
Antibody Production Techniques and Applications, Marcel Dekker, Inc., New York,
(1987) pp. 51-63).

The culture medium in which the hybridoma cells are cultured can then be assayed for
the presence of monoclonal antibodies directed against the antigen. Preferably, the binding
specificity of monoclonal antibodies produced by the hybridoma cells is determined by
immunoprecipitation or by an in vitro binding assay, such as radioimmunoassay (RIA) or
enzyme-linked immunosorbent assay (ELISA). Such techniques and assays are known in the
art. The binding affinity of the monoclonal antibody can, for example, be determined by the
Scatchard analysis of Munson and Pollard, Anal. Biochem., 107: 220 (1980). Preferably,
antibodies having a high degree of specificity and a high binding affinity for the target antigen
are isolated.
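
To illustrate the idea behind a Scatchard analysis, the sketch below fits a straight line to bound/free versus bound for a single class of binding sites, where the slope equals -Ka and the x-intercept equals the total site concentration (Bmax). This is a plain least-squares Scatchard fit under that single-site assumption, not the specific procedure of the cited reference, and the function name and synthetic data are hypothetical.

```python
def scatchard_fit(bound, free):
    """Least-squares line through (bound, bound/free).

    Returns (ka, bmax): the association constant (negative slope) and the
    total binding-site concentration (x-intercept of the fitted line).
    """
    if len(bound) != len(free) or len(bound) < 2:
        raise ValueError("need at least two matched bound/free measurements")
    x = list(bound)
    y = [b / f for b, f in zip(bound, free)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -slope, -intercept / slope


# Synthetic single-site data generated with Kd = 2.0 and Bmax = 10.0
# (arbitrary concentration units), so the fit should recover Ka = 0.5.
true_kd, true_bmax = 2.0, 10.0
free_conc = [0.5, 1.0, 2.0, 4.0, 8.0]
bound_conc = [true_bmax * f / (true_kd + f) for f in free_conc]
ka, bmax = scatchard_fit(bound_conc, free_conc)
print(ka, bmax)
```

A higher fitted Ka corresponds to the high binding affinity that the text prefers; curvature in real Scatchard plots would indicate that the single-site assumption does not hold.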

After the desired hybridoma cells are identified, the clones can be subcloned by limiting
dilution procedures and grown by standard methods. Suitable culture media for this purpose
include, for example, Dulbecco's Modified Eagle's Medium and RPMI-1640 medium.
Alternatively, the hybridoma cells can be grown in vivo as ascites in a mammal.

The monoclonal antibodies secreted by the subclones can be isolated or purified from the
culture medium or ascites fluid by conventional immunoglobulin purification procedures such
as, for example, protein A-Sepharose, hydroxylapatite chromatography, gel electrophoresis,
dialysis, or affinity chromatography.

The monoclonal antibodies can also be made by recombinant DNA methods, such as
those described in U.S. Patent No. 4,816,567. DNA encoding the monoclonal antibodies of the
invention can be readily isolated and sequenced using conventional procedures (e.g., by using
oligonucleotide probes that are capable of binding specifically to genes encoding the heavy and
light chains of murine antibodies). The hybridoma cells of the invention serve as a preferred
source of such DNA. Once isolated, the DNA can be placed into expression vectors, which are
then transfected into host cells such as simian COS cells, Chinese hamster ovary (CHO) cells, or
myeloma cells that do not otherwise produce immunoglobulin protein, to obtain the synthesis of
monoclonal antibodies in the recombinant host cells. The DNA also can be modified, for
example, by substituting the coding sequence for human heavy and light chain constant domains
in place of the homologous murine sequences (U.S. Patent No. 4,816,567; Morrison, Nature 368:
812-13 (1994)) or by covalently joining to the immunoglobulin coding sequence all or part of the
coding sequence for a non-immunoglobulin polypeptide. Such a non-immunoglobulin
polypeptide can be substituted for the constant domains of an antibody of the invention, or can
be substituted for the variable domains of one antigen-combining site of an antibody of the
invention to create a chimeric bivalent antibody.

5.13.2 Humanized Antibodies 

The antibodies directed against the protein antigens of the invention can further comprise
humanized antibodies or human antibodies. These antibodies are suitable for administration to
humans without engendering an immune response by the human against the administered
immunoglobulin. Humanized forms of antibodies are chimeric immunoglobulins,
immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab', F(ab')2 or other
antigen-binding subsequences of antibodies) that are principally comprised of the sequence of a human
immunoglobulin, and contain minimal sequence derived from a non-human immunoglobulin.
Humanization can be performed following the method of Winter and co-workers (Jones et al.,
Nature, 321:522-525 (1986); Riechmann et al., Nature, 332:323-327 (1988); Verhoeyen et al.,
Science, 239:1534-1536 (1988)), by substituting rodent CDRs or CDR sequences for the
corresponding sequences of a human antibody. (See also U.S. Patent No. 5,225,539.) In some
instances, Fv framework residues of the human immunoglobulin are replaced by corresponding
non-human residues. Humanized antibodies can also comprise residues which are found neither
in the recipient antibody nor in the imported CDR or framework sequences. In general, the
humanized antibody will comprise substantially all of at least one, and typically two, variable
domains, in which all or substantially all of the CDR regions correspond to those of a non-human
immunoglobulin and all or substantially all of the framework regions are those of a human
immunoglobulin consensus sequence. The humanized antibody optimally also will comprise at
least a portion of an immunoglobulin constant region (Fc), typically that of a human
immunoglobulin (Jones et al., 1986; Riechmann et al., 1988; and Presta, Curr. Op. Struct. Biol.,
2:593-596 (1992)).

5.13.3 Human Antibodies 

Fully human antibodies relate to antibody molecules in which essentially the entire
sequences of both the light chain and the heavy chain, including the CDRs, arise from human
genes. Such antibodies are termed "human antibodies", or "fully human antibodies" herein.
Human monoclonal antibodies can be prepared by the trioma technique; the human B-cell
hybridoma technique (see Kozbor, et al., 1983 Immunol Today 4: 72) and the EBV hybridoma
technique to produce human monoclonal antibodies (see Cole, et al., 1985 In: Monoclonal
Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). Human monoclonal
antibodies may be utilized in the practice of the present invention and may be produced by using
human hybridomas (see Cote, et al., 1983. Proc Natl Acad Sci USA 80: 2026-2030) or by
transforming human B-cells with Epstein Barr Virus in vitro (see Cole, et al., 1985 In:
Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96).

In addition, human antibodies can also be produced using additional techniques,
including phage display libraries (Hoogenboom and Winter, J. Mol. Biol., 227:381 (1991);
Marks et al., J. Mol. Biol., 222:581 (1991)). Similarly, human antibodies can be made by
introducing human immunoglobulin loci into transgenic animals, e.g., mice in which the
endogenous immunoglobulin genes have been partially or completely inactivated. Upon
challenge, human antibody production is observed, which closely resembles that seen in humans
in all respects, including gene rearrangement, assembly, and antibody repertoire. This approach
is described, for example, in U.S. Patent Nos. 5,545,807; 5,545,806; 5,569,825; 5,625,126;
5,633,425; 5,661,016, and in Marks et al. (Bio/Technology 10: 779-783 (1992)); Lonberg et al.
(Nature 368: 856-859 (1994)); Morrison (Nature 368: 812-13 (1994)); Fishwild et al. (Nature
Biotechnology 14: 845-51 (1996)); Neuberger (Nature Biotechnology 14: 826 (1996)); and
Lonberg and Huszar (Intern. Rev. Immunol. 13: 65-93 (1995)).

Human antibodies may additionally be produced using transgenic nonhuman animals
which are modified so as to produce fully human antibodies rather than the animal's endogenous
antibodies in response to challenge by an antigen. (See PCT publication WO94/02602). The
endogenous genes encoding the heavy and light immunoglobulin chains in the nonhuman host
have been incapacitated, and active loci encoding human heavy and light chain immunoglobulins
are inserted into the host's genome. The human genes are incorporated, for example, using yeast
artificial chromosomes containing the requisite human DNA segments. An animal which
provides all the desired modifications is then obtained as progeny by crossbreeding intermediate
transgenic animals containing fewer than the full complement of the modifications. The
preferred embodiment of such a nonhuman animal is a mouse, and is termed the Xenomouse™
as disclosed in PCT publications WO 96/33735 and WO 96/34096. This animal produces B
cells which secrete fully human immunoglobulins. The antibodies can be obtained directly from
the animal after immunization with an immunogen of interest, as, for example, a preparation of a
polyclonal antibody, or alternatively from immortalized B cells derived from the animal, such as
hybridomas producing monoclonal antibodies. Additionally, the genes encoding the
immunoglobulins with human variable regions can be recovered and expressed to obtain the
antibodies directly, or can be further modified to obtain analogs of antibodies such as, for
example, single chain Fv molecules.

An example of a method of producing a nonhuman host, exemplified as a mouse, lacking
expression of an endogenous immunoglobulin heavy chain is disclosed in U.S. Patent No.
5,939,598. It can be obtained by a method including deleting the J segment genes from at least
one endogenous heavy chain locus in an embryonic stem cell to prevent rearrangement of the
locus and to prevent formation of a transcript of a rearranged immunoglobulin heavy chain locus,
the deletion being effected by a targeting vector containing a gene encoding a selectable marker;
and producing from the embryonic stem cell a transgenic mouse whose somatic and germ cells
contain the gene encoding the selectable marker.

A method for producing an antibody of interest, such as a human antibody, is disclosed in
U.S. Patent No. 5,916,771. It includes introducing an expression vector that contains a
nucleotide sequence encoding a heavy chain into one mammalian host cell in culture,
introducing an expression vector containing a nucleotide sequence encoding a light chain into
another mammalian host cell, and fusing the two cells to form a hybrid cell. The hybrid cell
expresses an antibody containing the heavy chain and the light chain.

In a further improvement on this procedure, a method for identifying a clinically relevant
epitope on an immunogen, and a correlative method for selecting an antibody that binds
immunospecifically to the relevant epitope with high affinity, are disclosed in PCT publication
WO 99/53049.

5.13.4 Fab Fragments and Single Chain Antibodies 

According to the invention, techniques can be adapted for the production of single-chain
antibodies specific to an antigenic protein of the invention (see e.g., U.S. Patent No. 4,946,778).
In addition, methods can be adapted for the construction of Fab expression libraries (see e.g.,
Huse, et al., 1989 Science 246: 1275-1281) to allow rapid and effective identification of
monoclonal Fab fragments with the desired specificity for a protein or derivatives, fragments,
analogs or homologs thereof. Antibody fragments that contain the idiotypes to a protein antigen
may be produced by techniques known in the art including, but not limited to: (i) an F(ab')2
fragment produced by pepsin digestion of an antibody molecule; (ii) an Fab fragment generated
by reducing the disulfide bridges of an F(ab')2 fragment; (iii) an Fab fragment generated by the
treatment of the antibody molecule with papain and a reducing agent and (iv) Fv fragments.

5.13.5 Bispecific Antibodies

Bispecific antibodies are monoclonal, preferably human or humanized, antibodies that 
have binding specificities for at least two different antigens. In the present case, one of the 
binding specificities is for an antigenic protein of the invention. The second binding target is any 
other antigen, and advantageously is a cell-surface protein or receptor or receptor subunit.

Methods for making bispecific antibodies are known in the art. Traditionally, the
recombinant production of bispecific antibodies is based on the co-expression of two
immunoglobulin heavy-chain/light-chain pairs, where the two heavy chains have different
specificities (Milstein and Cuello, Nature, 305:537-539 (1983)). Because of the random
assortment of immunoglobulin heavy and light chains, these hybridomas (quadromas) produce a
potential mixture of ten different antibody molecules, of which only one has the correct
bispecific structure. The purification of the correct molecule is usually accomplished by affinity
chromatography steps. Similar procedures are disclosed in WO 93/08829, published 13 May
1993, and in Traunecker et al., 1991 EMBO J., 10:3655-3659.
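
The figure of ten different antibody molecules follows from simple counting, which the sketch below reproduces. It assumes that each antibody is an unordered pair of heavy-chain/light-chain half-molecules and that any of the two heavy chains can pair with any of the two light chains; the chain labels H1, H2, L1 and L2 are hypothetical.

```python
from itertools import combinations_with_replacement, product

heavy_chains = ("H1", "H2")   # H1 and L1 come from the first parental antibody
light_chains = ("L1", "L2")   # H2 and L2 come from the second parental antibody

# Each half-molecule is one heavy chain paired with one light chain (4 kinds).
half_molecules = list(product(heavy_chains, light_chains))

# A full antibody is an unordered pair of half-molecules (repeats allowed).
antibodies = list(combinations_with_replacement(half_molecules, 2))


def is_correct_bispecific(antibody):
    """True only when both arms carry their cognate heavy/light pairing."""
    return set(antibody) == {("H1", "L1"), ("H2", "L2")}


correct = [ab for ab in antibodies if is_correct_bispecific(ab)]
print(len(antibodies), "possible molecules,", len(correct), "correct bispecific")
```

Four kinds of half-molecule give 4 identical-arm pairs plus 6 mixed pairs, and only the pairing of H1/L1 with H2/L2 binds both intended antigens.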

Antibody variable domains with the desired binding specificities (antibody-antigen
combining sites) can be fused to immunoglobulin constant domain sequences. The fusion
preferably is with an immunoglobulin heavy-chain constant domain, comprising at least part of
the hinge, CH2, and CH3 regions. It is preferred to have the first heavy-chain constant region
(CH1) containing the site necessary for light-chain binding present in at least one of the fusions.
DNAs encoding the immunoglobulin heavy-chain fusions and, if desired, the immunoglobulin
light chain, are inserted into separate expression vectors, and are co-transfected into a suitable
host organism. For further details of generating bispecific antibodies see, for example, Suresh et
al., Methods in Enzymology, 121:210 (1986).

According to another approach described in WO 96/27011, the interface between a pair
of antibody molecules can be engineered to maximize the percentage of heterodimers which are
recovered from recombinant cell culture. The preferred interface comprises at least a part of the
CH3 region of an antibody constant domain. In this method, one or more small amino acid side
chains from the interface of the first antibody molecule are replaced with larger side chains (e.g.
tyrosine or tryptophan). Compensatory "cavities" of identical or similar size to the large side
chain(s) are created on the interface of the second antibody molecule by replacing large amino
acid side chains with smaller ones (e.g. alanine or threonine). This provides a mechanism for
increasing the yield of the heterodimer over other unwanted end-products such as homodimers.

Bispecific antibodies can be prepared as full length antibodies or antibody fragments (e.g.
F(ab')2 bispecific antibodies). Techniques for generating bispecific antibodies from antibody
fragments have been described in the literature. For example, bispecific antibodies can be
prepared using chemical linkage. Brennan et al., Science 229:81 (1985) describe a procedure
wherein intact antibodies are proteolytically cleaved to generate F(ab')2 fragments. These
fragments are reduced in the presence of the dithiol complexing agent sodium arsenite to
stabilize vicinal dithiols and prevent intermolecular disulfide formation. The Fab' fragments
generated are then converted to thionitrobenzoate (TNB) derivatives. One of the Fab'-TNB
derivatives is then reconverted to the Fab'-thiol by reduction with mercaptoethylamine and is
mixed with an equimolar amount of the other Fab'-TNB derivative to form the bispecific
antibody. The bispecific antibodies produced can be used as agents for the selective
immobilization of enzymes.

Additionally, Fab' fragments can be directly recovered from E. coli and chemically
coupled to form bispecific antibodies. Shalaby et al., J. Exp. Med. 175:217-225 (1992) describe
the production of a fully humanized bispecific antibody F(ab')2 molecule. Each Fab' fragment
was separately secreted from E. coli and subjected to directed chemical coupling in vitro to form
the bispecific antibody. The bispecific antibody thus formed was able to bind to cells
overexpressing the ErbB2 receptor and normal human T cells, as well as trigger the lytic activity
of human cytotoxic lymphocytes against human breast tumor targets.

Various techniques for making and isolating bispecific antibody fragments directly from
recombinant cell culture have also been described. For example, bispecific antibodies have been
produced using leucine zippers. Kostelny et al., J. Immunol. 148(5):1547-1553 (1992). The
leucine zipper peptides from the Fos and Jun proteins were linked to the Fab' portions of two
different antibodies by gene fusion. The antibody homodimers were reduced at the hinge region
to form monomers and then re-oxidized to form the antibody heterodimers. This method can
also be utilized for the production of antibody homodimers. The "diabody" technology
described by Hollinger et al., Proc. Natl. Acad. Sci. USA 90:6444-6448 (1993) has provided an
alternative mechanism for making bispecific antibody fragments. The fragments comprise a
heavy-chain variable domain (VH) connected to a light-chain variable domain (VL) by a linker
which is too short to allow pairing between the two domains on the same chain. Accordingly,
the VH and VL domains of one fragment are forced to pair with the complementary VL and VH
domains of another fragment, thereby forming two antigen-binding sites. Another strategy for
making bispecific antibody fragments by the use of single-chain Fv (sFv) dimers has also been
reported. See, Gruber et al., J. Immunol. 152:5368 (1994).

Antibodies with more than two valencies are contemplated. For example, trispecific
antibodies can be prepared. Tutt et al., J. Immunol. 147:60 (1991).

Exemplary bispecific antibodies can bind to two different epitopes, at least one of which
originates in the protein antigen of the invention. Alternatively, an anti-antigenic arm of an
immunoglobulin molecule can be combined with an arm which binds to a triggering molecule on
a leukocyte such as a T-cell receptor molecule (e.g. CD2, CD3, CD28, or B7), or Fc receptors for
IgG (FcγR), such as FcγRI (CD64), FcγRII (CD32) and FcγRIII (CD16) so as to focus cellular
defense mechanisms to the cell expressing the particular antigen. Bispecific antibodies can also
be used to direct cytotoxic agents to cells which express a particular antigen. These antibodies
possess an antigen-binding arm and an arm which binds a cytotoxic agent or a radionuclide
chelator, such as EOTUBE, DTPA, DOTA, or TETA. Another bispecific antibody of interest
binds the protein antigen described herein and further binds tissue factor (TF).

5.13.6 Heteroconjugate Antibodies

Heteroconjugate antibodies are also within the scope of the present invention.
Heteroconjugate antibodies are composed of two covalently joined antibodies. Such antibodies
have, for example, been proposed to target immune system cells to unwanted cells (U.S. Patent
No. 4,676,980), and for treatment of HIV infection (WO 91/00360; WO 92/200373; EP 03089).
It is contemplated that the antibodies can be prepared in vitro using known methods in synthetic
protein chemistry, including those involving crosslinking agents. For example, immunotoxins
can be constructed using a disulfide exchange reaction or by forming a thioether bond.
Examples of suitable reagents for this purpose include iminothiolate and
methyl-4-mercaptobutyrimidate and those disclosed, for example, in U.S. Patent No. 4,676,980.

5.13.7 Effector Function Engineering

It can be desirable to modify the antibody of the invention with respect to effector function, so as
to enhance, e.g., the effectiveness of the antibody in treating cancer. For example, cysteine
residue(s) can be introduced into the Fc region, thereby allowing interchain disulfide bond
formation in this region. The homodimeric antibody thus generated can have improved
internalization capability and/or increased complement-mediated cell killing and
antibody-dependent cellular cytotoxicity (ADCC). See Caron et al., J. Exp Med., 176: 1191-1195 (1992)
and Shopes, J. Immunol., 148: 2918-2922 (1992). Homodimeric antibodies with enhanced
anti-tumor activity can also be prepared using heterobifunctional cross-linkers as described in Wolff
et al., Cancer Research, 53: 2560-2565 (1993). Alternatively, an antibody can be engineered that
has dual Fc regions and can thereby have enhanced complement lysis and ADCC capabilities.
See Stevenson et al., Anti-Cancer Drug Design, 3: 219-230 (1989).

5.13.8 Immunoconjugates

The invention also pertains to immunoconjugates comprising an antibody conjugated to a
cytotoxic agent such as a chemotherapeutic agent, toxin (e.g., an enzymatically active toxin of
bacterial, fungal, plant, or animal origin, or fragments thereof), or a radioactive isotope (i.e., a
radioconjugate).

Chemotherapeutic agents useful in the generation of such immunoconjugates have been
described above. Enzymatically active toxins and fragments thereof that can be used include
diphtheria A chain, nonbinding active fragments of diphtheria toxin, exotoxin A chain (from
Pseudomonas aeruginosa), ricin A chain, abrin A chain, modeccin A chain, alpha-sarcin,
Aleurites fordii proteins, dianthin proteins, Phytolacca americana proteins (PAPI, PAPII, and
PAP-S), Momordica charantia inhibitor, curcin, crotin, Saponaria officinalis inhibitor, gelonin,
mitogellin, restrictocin, phenomycin, enomycin, and the trichothecenes. A variety of

radionuclides are available for the production of radioconjugated antibodies. Examples include 

Conjugates of the antibody and cytotoxic agent are made using a variety of bifunctional
protein-coupling agents such as N-succinimidyl-3-(2-pyridyldithiol) propionate (SPDP),
iminothiolane (IT), bifunctional derivatives of imidoesters (such as dimethyl adipimidate HCl),
active esters (such as disuccinimidyl suberate), aldehydes (such as glutaraldehyde), bis-azido
compounds (such as bis (p-azidobenzoyl) hexanediamine), bis-diazonium derivatives (such as
bis-(p-diazoniumbenzoyl)-ethylenediamine), diisocyanates (such as toluene 2,6-diisocyanate),
and bis-active fluorine compounds (such as 1,5-difluoro-2,4-dinitrobenzene). For example, a
ricin immunotoxin can be prepared as described in Vitetta et al., Science, 238: 1098 (1987).
Carbon-14-labeled 1-isothiocyanatobenzyl-3-methyldiethylene triaminepentaacetic acid
(MX-DTPA) is an exemplary chelating agent for conjugation of radionuclide to the antibody. See
WO94/11026.

In another embodiment, the antibody can be conjugated to a "receptor" (such as
streptavidin) for utilization in tumor pretargeting wherein the antibody-receptor conjugate is
administered to the patient, followed by removal of unbound conjugate from the circulation
using a clearing agent and then administration of a "ligand" (e.g., avidin) that is in turn
conjugated to a cytotoxic agent.

4.14 COMPUTER READABLE SEQUENCES

In one application of this embodiment, a nucleotide sequence of the present invention can
be recorded on computer readable media. As used herein, "computer readable media" refers to
any medium which can be read and accessed directly by a computer. Such media include, but
are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and
magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM
and ROM; and hybrids of these categories such as magnetic/optical storage media. A skilled
artisan can readily appreciate how any of the presently known computer readable media can
be used to create a manufacture comprising computer readable medium having recorded thereon
a nucleotide sequence of the present invention. As used herein, "recorded" refers to a process for
storing information on computer readable medium. A skilled artisan can readily adopt any of the
presently known methods for recording information on computer readable medium to generate
manufactures comprising the nucleotide sequence information of the present invention.

A variety of data storage structures are available to a skilled artisan for creating a
computer readable medium having recorded thereon a nucleotide sequence of the present
invention. The choice of the data storage structure will generally be based on the means chosen
to access the stored information. In addition, a variety of data processor programs and formats
can be used to store the nucleotide sequence information of the present invention on computer
readable medium. The sequence information can be represented in a word processing text file,
formatted in commercially-available software such as WordPerfect and Microsoft Word, or
represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase,
Oracle, or the like. A skilled artisan can readily adapt any number of data processor structuring
formats (e.g., text file or database) in order to obtain computer readable medium having recorded
thereon the nucleotide sequence information of the present invention.
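
As a minimal sketch of the two storage routes named above (an ASCII text file and a database application), the following Python example records a sequence in a FASTA-style text file and in an SQLite table. SQLite merely stands in for the database products listed above; the record name, the placeholder sequence, the file name and the table layout are all hypothetical and are not taken from the specification.

```python
import os
import sqlite3
import tempfile


def write_fasta(path, name, sequence, width=60):
    """Record a sequence as a plain ASCII (FASTA-style) text file."""
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    with open(path, "w", encoding="ascii") as handle:
        handle.write(f">{name}\n")
        handle.write("\n".join(lines) + "\n")


def read_fasta(path):
    """Read back a single-record FASTA-style file as (name, sequence)."""
    with open(path, encoding="ascii") as handle:
        lines = [line.strip() for line in handle if line.strip()]
    return lines[0][1:], "".join(lines[1:])


def store_in_database(connection, name, sequence):
    """Record a sequence in a relational table (the layout is an assumption)."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS sequences (name TEXT PRIMARY KEY, seq TEXT)"
    )
    connection.execute(
        "INSERT OR REPLACE INTO sequences VALUES (?, ?)", (name, sequence)
    )
    connection.commit()


# Placeholder data for illustration only; not a sequence of the invention.
EXAMPLE_NAME = "example_record"
EXAMPLE_SEQ = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        fasta_path = os.path.join(folder, "example.fasta")
        write_fasta(fasta_path, EXAMPLE_NAME, EXAMPLE_SEQ)
        print(read_fasta(fasta_path))

    db = sqlite3.connect(":memory:")
    store_in_database(db, EXAMPLE_NAME, EXAMPLE_SEQ)
    print(db.execute("SELECT seq FROM sequences").fetchone()[0])
```

Either representation can be read back by other software, which is the property the text relies on; the choice between them would follow from how the stored information is to be accessed.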

By providing any of the nucleotide sequences SEQ ID NO: 1 or a representative fragment
thereof; or a nucleotide sequence at least 95% identical to any of the nucleotide sequences of
SEQ ID NO: 1 in computer readable form, a skilled artisan can routinely access the sequence
information for a variety of purposes. Computer software is publicly available which allows a
skilled artisan to access sequence information provided in a computer readable medium. The
examples which follow demonstrate how software which implements the BLAST (Altschul et
al., J. Mol. Biol. 215:403-410 (1990)) and BLAZE (Brutlag et al., Comp. Chem. 17:203-207
(1993)) search algorithms on a Sybase system is used to identify open reading frames (ORFs)
within a nucleic acid sequence. Such ORFs may be protein encoding fragments and may be
useful in producing commercially important proteins such as enzymes used in fermentation
reactions and in the production of commercially useful metabolites.
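
To make the notion of identifying ORFs within a nucleic acid sequence concrete, the sketch below scans the three forward reading frames for ATG...stop stretches. It is a deliberately simple illustration and is not the BLAST or BLAZE algorithm; the function name, the `min_codons` threshold and the test string are hypothetical, and a fuller scan would also examine the reverse complement.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(sequence, min_codons=3):
    """Return (start, end, frame) for ATG...stop ORFs on the forward strand.

    start is 0-based; end is exclusive and includes the stop codon.
    min_codons counts the ATG but not the stop codon.
    """
    sequence = sequence.upper()
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(sequence) - 2, 3):
            codon = sequence[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos                      # open a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (pos - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3, frame))
                start = None                     # look for the next ATG
    return orfs


if __name__ == "__main__":
    # Hypothetical test string, not a sequence of the invention.
    for start, end, frame in find_orfs("CCATGAAACCCGGGTAGTT"):
        print(frame, start, end)
```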

As used herein, "a computer-based system" refers to the hardware means, software
means, and data storage means used to analyze the nucleotide sequence information of the
present invention. The minimum hardware means of the computer-based systems of the present
invention comprises a central processing unit (CPU), input means, output means, and data
storage means. A skilled artisan can readily appreciate that any one of the currently available
computer-based systems is suitable for use in the present invention. As stated above, the
computer-based systems of the present invention comprise a data storage means having stored
therein a nucleotide sequence of the present invention and the necessary hardware means and
software means for supporting and implementing a search means. As used herein, "data storage
means" refers to memory which can store nucleotide sequence information of the present
invention, or a memory access means which can access manufactures having recorded thereon
the nucleotide sequence information of the present invention.

As used herein, "search means" refers to one or more programs which are implemented
on the computer-based system to compare a target sequence or target structural motif with the
sequence information stored within the data storage means. Search means are used to identify
fragments or regions of a known sequence which match a particular target sequence or target
motif. A variety of known algorithms are disclosed publicly and a variety of commercially
available software for conducting search means are and can be used in the computer-based
systems of the present invention. Examples of such software include, but are not limited to,
Smith-Waterman, MacPattern (EMBL), BLASTN and BLASTA (NCBI). A
skilled artisan can readily recognize that any one of the available algorithms or implementing
software packages for conducting homology searches can be adapted for use in the present
computer-based systems. As used herein, a "target sequence" can be any nucleic acid or amino
acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can
readily recognize that the longer a target sequence is, the less likely a target sequence will be
present as a random occurrence in the database. The most preferred sequence length of a target
sequence is from about 10 to 300 amino acids, more preferably from about 30 to 100 nucleotide
residues. However, it is well recognized that searches for commercially important fragments,
such as sequence fragments involved in gene expression and protein processing, may be of
shorter length.

As used herein, "a target structural motif," or "target motif," refers to any rationally
selected sequence or combination of sequences in which the sequence(s) are chosen based on a
three-dimensional configuration which is formed upon the folding of the target motif. There are
a variety of target motifs known in the art. Protein target motifs include, but are not limited to,
enzyme active sites and signal sequences. Nucleic acid target motifs include, but are not limited
to, promoter sequences, hairpin structures and inducible expression elements (protein binding
sequences).
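
As one illustration of searching for a nucleic acid target motif, the sketch below looks for candidate hairpin structures, treated here simply as a stem followed by a short loop and then the reverse complement of the stem. This definition, the default stem and loop lengths, and the function names are assumptions made for the example; real hairpin prediction also weighs base-pairing energetics.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_hairpins(sequence, stem=4, min_loop=3, max_loop=8):
    """Return (stem_start, loop_length) pairs for candidate hairpins."""
    sequence = sequence.upper()
    hits = []
    for i in range(len(sequence) - 2 * stem - min_loop + 1):
        target = reverse_complement(sequence[i:i + stem])
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop          # where the second stem arm would begin
            if sequence[j:j + stem] == target:
                hits.append((i, loop))
    return hits


if __name__ == "__main__":
    # Hypothetical test string: stem GGGC, loop AAAA, stem GCCC.
    print(find_hairpins("GGGCAAAAGCCC"))
```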

4.15 TRIPLE HELIX FORMATION 

In addition, the fragments of the present invention, as broadly described, can be used to
control gene expression through triple helix formation or antisense DNA or RNA, both of which
methods are based on the binding of a polynucleotide sequence to DNA or RNA.
Polynucleotides suitable for use in these methods are preferably 20 to 40 bases in length and are
designed to be complementary to a region of the gene involved in transcription (triple helix - see
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and
Dervan et al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J.
Neurochem. 56:560 (1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression,
CRC Press, Boca Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of
RNA transcription from DNA, while antisense RNA hybridization blocks translation of an
mRNA molecule into polypeptide. Both techniques have been demonstrated to be effective in
model systems. Information contained in the sequences of the present invention is necessary for
the design of an antisense or triple helix oligonucleotide.
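
The way sequence information feeds into antisense design can be sketched as follows: an antisense oligonucleotide is the reverse complement of the chosen mRNA region, here restricted to the preferred 20 to 40 bases. The function name and the input strings are hypothetical, and choosing which region to target (accessibility, specificity) is outside the scope of this sketch.

```python
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_oligo(mrna, start, length=20):
    """Return the DNA antisense oligo (5'->3') for mrna[start:start + length]."""
    if not 20 <= length <= 40:
        raise ValueError("length outside the preferred 20 to 40 base range")
    # Work in the DNA alphabet so that RNA and cDNA inputs behave alike.
    region = mrna.upper().replace("U", "T")[start:start + length]
    if len(region) < length:
        raise ValueError("target region runs past the end of the sequence")
    return region.translate(DNA_COMPLEMENT)[::-1]


if __name__ == "__main__":
    # Hypothetical mRNA fragment, not a sequence of the invention.
    print(antisense_oligo("AUGGCCAUUGUAAUGGGCCGCUGAAAGGGUGCCCGAUAG", 0, 20))
```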

4.16 DIAGNOSTIC ASSAYS AND KITS 

The present invention further provides methods to identify the presence or expression of
one of the ORFs of the present invention, or homolog thereof, in a test sample, using a nucleic
acid probe or antibodies of the present invention, optionally conjugated or otherwise associated
with a suitable label.

In general, methods for detecting a polynucleotide of the invention can comprise
contacting a sample with a compound that binds to and forms a complex with the polynucleotide
for a period sufficient to form the complex, and detecting the complex, so that if a complex is
detected, a polynucleotide of the invention is detected in the sample. Such methods can also
comprise contacting a sample under stringent hybridization conditions with nucleic acid primers
that anneal to a polynucleotide of the invention under such conditions, and amplifying annealed
polynucleotides, so that if a polynucleotide is amplified, a polynucleotide of the invention is
detected in the sample.

In general, methods for detecting a polypeptide of the invention can comprise contacting
a sample with a compound that binds to and forms a complex with the polypeptide for a period
sufficient to form the complex, and detecting the complex, so that if a complex is detected, a
polypeptide of the invention is detected in the sample.

In detail, such methods comprise incubating a test sample with one or more of the 
antibodies or one or more of the nucleic acid probes of the present invention and assaying for 
binding of the nucleic acid probes or antibodies to components within the test sample. 

Conditions for incubating a nucleic acid probe or antibody with a test sample vary.
Incubation conditions depend on the format employed in the assay, the detection methods
employed, and the type and nature of the nucleic acid probe or antibody used in the assay. One
skilled in the art will recognize that any one of the commonly available hybridization,
amplification or immunological assay formats can readily be adapted to employ the nucleic acid
probes or antibodies of the present invention. Examples of such assays can be found in Chard,
T., An Introduction to Radioimmunoassay and Related Techniques, Elsevier Science
Publishers, Amsterdam, The Netherlands (1986); Bullock, G.R. et al., Techniques in
Immunocytochemistry, Academic Press, Orlando, FL Vol. 1 (1982), Vol. 2 (1983), Vol. 3
(1985); Tijssen, P., Practice and Theory of Immunoassays: Laboratory Techniques in
Biochemistry and Molecular Biology, Elsevier Science Publishers, Amsterdam, The Netherlands
(1985). The test samples of the present invention include cells, protein or membrane extracts of
cells, or biological fluids such as sputum, blood, serum, plasma, or urine. The test sample used
in the above-described method will vary based on the assay format, nature of the detection
method and the tissues, cells or extracts used as the sample to be assayed. Methods for preparing
protein extracts or membrane extracts of cells are well known in the art and can readily be
adapted in order to obtain a sample which is compatible with the system utilized.

In another embodiment of the present invention, kits are provided which contain the
necessary reagents to carry out the assays of the present invention. Specifically, the invention
provides a compartment kit to receive, in close confinement, one or more containers which
comprises: (a) a first container comprising one of the probes or antibodies of the present
invention; and (b) one or more other containers comprising one or more of the following: wash
reagents, reagents capable of detecting the presence of a bound probe or antibody.

In detail, a compartment kit includes any kit in which reagents are contained in separate
containers. Such containers include small glass containers, plastic containers or strips of plastic
or paper. Such containers allow one to efficiently transfer reagents from one compartment to
another compartment such that the samples and reagents are not cross-contaminated, and the
agents or solutions of each container can be added in a quantitative fashion from one
compartment to another. Such containers will include a container which will accept the test
sample, a container which contains the antibodies used in the assay, containers which contain
wash reagents (such as phosphate buffered saline, Tris-buffers, etc.), and containers which
contain the reagents used to detect the bound antibody or probe. Types of detection reagents
include labeled nucleic acid probes, labeled secondary antibodies, or in the alternative, if the
primary antibody is labeled, the enzymatic, or antibody binding reagents which are capable of
reacting with the labeled antibody. One skilled in the art will readily recognize that the disclosed
probes and antibodies of the present invention can be readily incorporated into one of the
established kit formats which are well known in the art.

4.17 MEDICAL IMAGING

The novel polypeptides and binding partners of the invention are useful in medical
imaging of sites expressing the molecules of the invention (e.g., where the polypeptide of the
invention is involved in the immune response, for imaging sites of inflammation or infection).
See, e.g., Kunkel et al., U.S. Pat. No. 5,413,778. Such methods involve chemical attachment of
a labeling or imaging agent, administration of the labeled polypeptide to a subject in a
pharmaceutically acceptable carrier, and imaging the labeled polypeptide in vivo at the target
site.

4.18 SCREENING ASSAYS 

Using the isolated proteins and polynucleotides of the invention, the present invention
further provides methods of obtaining and identifying agents which bind to a polypeptide
encoded by an ORF corresponding to any of the nucleotide sequences set forth in SEQ ID NO: 1,
or bind to a specific domain of the polypeptide encoded by the nucleic acid. In detail, said
method comprises the steps of:

(a) contacting an agent with an isolated protein encoded by an ORF of the present 
invention, or nucleic acid of the invention; and 

(b) determining whether the agent binds to said protein or said nucleic acid. 
In general, therefore, such methods for identifying compounds that bind to a
polynucleotide of the invention can comprise contacting a compound with a polynucleotide of
the invention for a time sufficient to form a polynucleotide/compound complex, and detecting
the complex, so that if a polynucleotide/compound complex is detected, a compound that binds
to a polynucleotide of the invention is identified.

Likewise, in general, therefore, such methods for identifying compounds that bind to a
polypeptide of the invention can comprise contacting a compound with a polypeptide of the
invention for a time sufficient to form a polypeptide/compound complex, and detecting the
complex, so that if a polypeptide/compound complex is detected, a compound that binds to a
polypeptide of the invention is identified.

Methods for identifying compounds that bind to a polypeptide of the invention can also
comprise contacting a compound with a polypeptide of the invention in a cell for a time
sufficient to form a polypeptide/compound complex, wherein the complex drives expression of a
reporter gene sequence in the cell, and detecting the complex by detecting reporter gene
sequence expression, so that if a polypeptide/compound complex is detected, a compound that
binds a polypeptide of the invention is identified.

Compounds identified via such methods can include compounds which modulate the
activity of a polypeptide of the invention (that is, increase or decrease its activity, relative to
activity observed in the absence of the compound). Alternatively, compounds identified via such
methods can include compounds which modulate the expression of a polynucleotide of the
invention (that is, increase or decrease expression relative to expression levels observed in the
absence of the compound). Compounds, such as compounds identified via the methods of the
invention, can be tested using standard assays well known to those of skill in the art for their
ability to modulate activity/expression.

The agents screened in the above assay can be, but are not limited to, peptides,
carbohydrates, vitamin derivatives, or other pharmaceutical agents. The agents can be selected
and screened at random or rationally selected or designed using protein modeling techniques.

For random screening, agents such as peptides, carbohydrates, pharmaceutical agents and
the like are selected at random and are assayed for their ability to bind to the protein encoded by
the ORF of the present invention. Alternatively, agents may be rationally selected or designed.
As used herein, an agent is said to be "rationally selected or designed" when the agent is chosen
based on the configuration of the particular protein. For example, one skilled in the art can
readily adapt currently available procedures to generate peptides, pharmaceutical agents and the
like, capable of binding to a specific peptide sequence, in order to generate rationally designed
antipeptide peptides, for example see Hurby et al., "Application of Synthetic Peptides: Antisense
Peptides," in Synthetic Peptides: A User's Guide, W.H. Freeman, NY (1992), pp. 289-307,
and Kaspczak et al., Biochemistry 28:9230-8 (1989), or pharmaceutical agents, or the like.

In addition to the foregoing, one class of agents of the present invention, as broadly
described, can be used to control gene expression through binding to one of the ORFs or EMFs
of the present invention. As described above, such agents can be randomly screened or
rationally designed/selected. Targeting the ORF or EMF allows a skilled artisan to design
sequence specific or element specific agents, modulating the expression of either a single ORF or
multiple ORFs which rely on the same EMF for expression control. One class of DNA binding
agents are agents which contain base residues which hybridize or form a triple helix formation
by binding to DNA or RNA. Such agents can be based on the classic phosphodiester,
ribonucleic acid backbone, or can be a variety of sulfhydryl or polymeric derivatives which have
base attachment capacity.

Agents suitable for use in these methods preferably contain 20 to 40 bases and are
designed to be complementary to a region of the gene involved in transcription (triple helix - see
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan et
al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press,
Boca Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA
transcription from DNA, while antisense RNA hybridization blocks translation of an mRNA
molecule into polypeptide. Both techniques have been demonstrated to be effective in model
systems. Information contained in the sequences of the present invention is necessary for the
design of an antisense or triple helix oligonucleotide and other DNA binding agents.

Agents which bind to a protein encoded by one of the ORFs of the present invention can
be used as a diagnostic agent. Agents which bind to a protein encoded by one of the ORFs of the
present invention can be formulated using known techniques to generate a pharmaceutical
composition.

4.19 USE OF NUCLEIC ACIDS AS PROBES

Another aspect of the subject invention is to provide for polypeptide-specific nucleic acid
hybridization probes capable of hybridizing with naturally occurring nucleotide sequences. The
hybridization probes of the subject invention may be derived from any of the nucleotide
sequences SEQ ID NO: 1. Because the corresponding gene is only expressed in a limited number
of tissues, a hybridization probe derived from any of the nucleotide sequences SEQ ID NO: 1
can be used as an indicator of the presence of RNA of a cell type of such a tissue in a sample.
Any suitable hybridization technique can be employed, such as, for example, in situ
hybridization. PCR as described in US Patents Nos. [...] provides
additional uses for oligonucleotides based upon the nucleotide sequences. Such probes used in
PCR may be of recombinant origin, may be chemically synthesized, or a mixture of both. The
probe will comprise a discrete nucleotide sequence for the detection of identical sequences or a
degenerate pool of possible sequences for identification of closely related genomic sequences.
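
A degenerate pool can be pictured as the set of discrete sequences spelled out by a probe written with IUPAC ambiguity codes. The sketch below expands such a probe into its pool; the function name and the example probes are hypothetical, and the pool size grows multiplicatively with each degenerate position.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(probe):
    """List every discrete sequence encoded by an IUPAC-degenerate probe."""
    choices = [IUPAC[base] for base in probe.upper()]
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    # Hypothetical probe: Y stands for C or T.
    print(expand_degenerate("AYG"))
    print(len(expand_degenerate("NNR")))
```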

Other means for producing specific hybridization probes for nucleic acids include the
cloning of nucleic acid sequences into vectors for the production of mRNA probes. Such vectors
are known in the art and are commercially available and may be used to synthesize RNA probes
in vitro by means of the addition of the appropriate RNA polymerase such as T7 or SP6 RNA
polymerase and the appropriate radioactively labeled nucleotides. The nucleotide sequences
may be used to construct hybridization probes for mapping their respective genomic sequences.
The nucleotide sequence provided herein may be mapped to a chromosome or specific regions of
a chromosome using well known genetic and/or chromosomal mapping techniques. These
techniques include in situ hybridization, linkage analysis against known chromosomal markers,
hybridization screening with libraries or flow-sorted chromosomal preparations specific to
known chromosomes, and the like. The technique of fluorescent in situ hybridization of
chromosome spreads has been described, among other places, in Verma et al. (1988) Human
Chromosomes: A Manual of Basic Techniques, Pergamon Press, New York NY.
Fluorescent in situ hybridization of chromosomal preparations and other physical
chromosome mapping techniques may be correlated with additional genetic map data. Examples

CovaLink NH secondary amino groups that are positioned at the end of spacer arms covalently
grafted onto the polystyrene surface through a 2 nm long spacer arm. To link an oligonucleotide to
CovaLink NH via a phosphoramidate bond, the oligonucleotide terminus must have a 5'-end
phosphate group. It is, perhaps, even possible for biotin to be covalently bound to CovaLink and
then streptavidin used to bind the probes.

More specifically, the linkage method includes dissolving DNA in water (7.5 ng/ul) and
denaturing for 10 min. at 95°C and cooling on ice for 10 min. Ice-cold 0.1 M 1-methylimidazole,
pH 7.0 (1-MeIm7), is then added to a final concentration of 10 mM 1-MeIm7. A ss DNA solution is
then dispensed into CovaLink NH strips (75 ul/well) standing on ice.

Carbodiimide 0.2 M 1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide (EDC), dissolved in
10 mM 1-MeIm7, is made fresh and 25 ul added per well. The strips are incubated for 5 hours at
50°C. After incubation the strips are washed using, e.g., Nunc-Immuno Wash; first the wells are
washed 3 times, then they are soaked with washing solution for 5 min., and finally they are washed
3 times (wherein the washing solution is 0.4 N NaOH, 0.25% SDS heated to 50°C).
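
The quantities in the linkage protocol above follow from simple dilution arithmetic, sketched below. The helper names and the 1000 ul batch volume are hypothetical, and the per-well DNA figure assumes that the stated 7.5 ng/ul still applies to the 75 ul dispensed into each well, which the text does not state explicitly.

```python
def stock_volume(final_conc, final_volume, stock_conc):
    """Volume of stock to use, from C_stock * V_stock = C_final * V_final."""
    return final_conc * final_volume / stock_conc


def conc_after_mixing(stock_conc, added_volume, starting_volume):
    """Concentration of an added reagent once mixed into starting_volume."""
    return stock_conc * added_volume / (starting_volume + added_volume)


if __name__ == "__main__":
    # 0.1 M 1-MeIm7 stock brought to 10 mM in a hypothetical 1000 ul batch.
    print("1-MeIm7 stock (ul):", stock_volume(0.010, 1000.0, 0.1))
    # DNA per well, assuming 7.5 ng/ul still holds when 75 ul is dispensed.
    print("DNA per well (ng):", 7.5 * 75)
    # EDC in the well after 25 ul of 0.2 M stock joins the 75 ul already there.
    print("EDC in well (M):", conc_after_mixing(0.2, 25.0, 75.0))
```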

It is contemplated that a further suitable method for use with the present invention is that
described in PCT Patent Application [...], incorporated herein by
reference. This method of preparing an oligonucleotide bound to a support involves attaching a
nucleoside 3'-reagent through the phosphate group by a covalent phosphodiester link to aliphatic
hydroxyl groups carried by the support. The oligonucleotide is then synthesized on the supported
nucleoside and protecting groups removed from the synthetic oligonucleotide chain under standard
conditions that do not cleave the oligonucleotide from the support. Suitable reagents include
nucleoside phosphoramidite and nucleoside hydrogen phosphorate.

An on-chip strategy for the preparation of DNA probe
arrays may be employed. For example, addressable laser-activated photodeprotection may be
employed in the chemical synthesis of oligonucleotides directly on a glass surface, as described by
Fodor et al. (1991) Science 251(4995) 767-73, incorporated herein by reference. Probes may also
be immobilized on nylon supports as described by Van Ness et al. (1991) Nucleic Acids Res. 19(12)
3345-50; or linked to Teflon using the method of Duncan & Cavalier (1988) Anal. Biochem. 169(1)
104-8; all references being specifically incorporated herein.

Linking an oligonucleotide to a nylon support, as described by Van Ness et al. (1991),
requires activation of the nylon surface via alkylation and selective activation of the 5'-amine of
oligonucleotides with cyanuric chloride.

One particular way to prepare support bound oligonucleotides is to utilize the
light-generated synthesis described by Pease et al. (1994) PNAS USA 91(11) 5022-6, incorporated
herein by reference. These authors used current photolithographic techniques to generate arrays of
immobilized oligonucleotide probes (DNA chips). These methods, in which light is used to direct
the synthesis of oligonucleotide probes in high-density, miniaturized arrays, utilize photolabile
5'-protected N-acyl-deoxynucleoside phosphoramidites, surface linker chemistry and versatile
combinatorial synthesis strategies. A matrix of 256 spatially defined oligonucleotide probes may be
generated in this manner.
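
One way to arrive at a matrix of 256 probes is to vary four positions over the four bases (4^4 = 256) and place the results on a 16 x 16 grid. The sketch below enumerates such a matrix; this layout is an illustrative assumption and is not asserted to be the arrangement used in the cited work, and all names are hypothetical.

```python
from itertools import product

BASES = "ACGT"


def probe_matrix(variable_positions=4):
    """Lay out every base combination over the variable positions on a square grid.

    4 ** n combinations always fit a square grid with 2 ** n cells per side.
    """
    probes = ["".join(p) for p in product(BASES, repeat=variable_positions)]
    side = 2 ** variable_positions
    return [probes[r * side:(r + 1) * side] for r in range(side)]


def coupling_cycles(variable_positions=4):
    """Cycles needed if each cycle adds one base type at one position."""
    return len(BASES) * variable_positions


if __name__ == "__main__":
    grid = probe_matrix()
    print(len(grid), "x", len(grid[0]), "grid,", coupling_cycles(), "cycles")
```

Under this scheme the number of coupling cycles grows linearly with probe length while the number of distinct probes grows exponentially, which is the appeal of a combinatorial synthesis strategy.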

4.21 PREPARATION OF NUCLEIC ACID FRAGMENTS 
The nucleic acids may be obtained from any appropriate source, such as cDNAs, genomic
DNA, chromosomal DNA, microdissected chromosome bands, cosmid or YAC inserts, and RNA,
including mRNA without any amplification steps. For example, Sambrook et al. (1989) describes
three protocols for the isolation of high molecular weight DNA from mammalian cells (p.
9.14-9.23).

DNA fragments may be prepared as clones in M13, plasmid or lambda vectors and/or
prepared directly from genomic DNA or cDNA by PCR or other amplification methods. Samples
may be prepared or dispensed in multiwell plates. About 100-1000 ng of DNA samples may be
prepared in 2-500 ml of final volume.

The nucleic acids would then be fragmented by any of the methods known to those of skill
in the art including, for example, using restriction enzymes as described at 9.24-9.28 of Sambrook et
al. (1989), shearing by ultrasound and NaOH treatment.

Low pressure shearing is also appropriate, as described by Schriefer et al. (1990) Nucleic
Acids Res. 18(24) 7455-6, incorporated herein by reference. In this method, DNA samples are
passed through a small French pressure cell at a variety of low to intermediate pressures. A lever
device allows controlled application of low to intermediate pressures to the cell. The results of
these studies indicate that low-pressure shearing is a useful alternative to sonic and enzymatic DNA
fragmentation methods.

One particularly suitable way for fragmenting DNA is contemplated to be that using the two
base recognition endonuclease, CviJI, described by Fitzgerald et al. (1992) Nucleic Acids Res. 20
3753-62. These authors described an approach for the rapid fragmentation and fractionation of
DNA into particular sizes that they contemplated to be suitable for shotgun cloning and sequencing.
The restriction endonuclease CviJI normally cleaves the recognition sequence PuGCPy
between the G and C to leave blunt ends. Atypical reaction conditions, which alter the specificity of
this enzyme (CviJI**), yield a quasi-random distribution of DNA fragments from the small
molecule pUC19 (2688 base pairs). Fitzgerald et al. (1992) quantitatively evaluated the
randomness of this fragmentation strategy, using a CviJI** digest of pUC19 that was size
fractionated by a rapid gel filtration method and directly ligated, without end repair, to a lac Z minus
M13 cloning vector. Sequence analysis of 76 clones showed that CviJI** restricts PyGCPy and
PuGCPu, in addition to PuGCPy sites, and that new sequence data is accumulated at a rate
consistent with random fragmentation.
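
The two specificities described above can be written as patterns, with Pu standing for A or G and Py for C or T. The sketch below locates the cut positions (between the G and the C) on one strand of a linear sequence and returns the resulting blunt-ended fragments. The function names and the test string are hypothetical, and the example ignores the circularity of a plasmid such as pUC19.

```python
import re

# Pu = purine (A or G), Py = pyrimidine (C or T).
# Lookaheads are used so that closely spaced sites are all reported.
NORMAL_SITE = re.compile(r"(?=[AG]GC[CT])")                          # PuGCPy
RELAXED_SITE = re.compile(r"(?=[AG]GC[CT]|[CT]GC[CT]|[AG]GC[AG])")   # CviJI**


def cut_positions(sequence, relaxed=False):
    """Return 0-based indices at which the strand is cut (between G and C)."""
    pattern = RELAXED_SITE if relaxed else NORMAL_SITE
    return [m.start() + 2 for m in pattern.finditer(sequence.upper())]


def fragments(sequence, relaxed=False):
    """Split a linear sequence at every cut position (blunt ends)."""
    cuts = [0] + cut_positions(sequence, relaxed) + [len(sequence)]
    return [sequence[a:b] for a, b in zip(cuts, cuts[1:])]


if __name__ == "__main__":
    # Hypothetical test string with one PuGCPy site and one PyGCPy site.
    test = "TTAGCTAACGCCTT"
    print(fragments(test))
    print(fragments(test, relaxed=True))
```

Because the relaxed specificity accepts three of the four NGCN classes, sites occur far more often than under normal conditions, which is consistent with the quasi-random fragment distribution reported.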

As reported in the literature, advantages of this approach compared to sonication and
agarose gel fractionation include: smaller amounts of DNA are required (0.2-0.5 ug instead of 2-5
ug); and fewer steps are involved (no preligation, end repair, chemical extraction, or agarose gel
electrophoresis and elution are needed).

Irrespective of the manner in which the nucleic acid fragments are obtained or prepared, it is
important to denature the DNA to give single stranded pieces available for hybridization. This is
achieved by incubating the DNA solution for 2-5 minutes at 80-90°C. The solution is then cooled
quickly to 2°C to prevent renaturation of the DNA fragments before they are contacted with the
chip. Phosphate groups must also be removed from genomic DNA by methods known in the art.

4.22 PREPARATION OF DNA ARRAYS 

Arrays may be prepared by spotting DNA samples on a support such as a nylon membrane.
Spotting may be performed by using arrays of metal pins (the positions of which correspond to an
array of wells in a microtiter plate) to repeatedly transfer about 20 nl of a DNA solution to a
nylon membrane. By offset printing, a density of dots higher than the density of the wells is
achieved. One to 25 dots may be accommodated in 1 mm², depending on the type of label used. By
avoiding spotting in some preselected number of rows and columns, separate subsets (subarrays)
may be formed. Samples in one subarray may be the same genomic segment of DNA (or the same
gene) from different individuals, or may be different, overlapped genomic clones. Each of the
subarrays may represent replica spotting of the same samples. In one example, a selected gene
segment may be amplified from 64 patients. For each patient, the amplified gene segment may be in
one 96-well plate (all 96 wells containing the same sample). A plate for each of the 64 patients is
prepared. By using a 96-pin device, all samples may be spotted on one 8 x 12 cm membrane.
Subarrays may contain 64 samples, one from each patient. Where the 96 subarrays are identical, the
dot span may be 1 mm² and there may be a 1 mm space between subarrays.
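
The layout arithmetic of the 64-patient example can be sketched in Python. The sketch assumes a 9 mm pin pitch (the usual spacing of a 96-well plate), an 8 x 8 arrangement of the 64 patient dots under each pin, and a 1 mm dot pitch; only the 1 mm dot span, the 1 mm space between subarrays and the 8 x 12 cm membrane come from the text, and all names are illustrative.

```python
PIN_ROWS, PIN_COLS = 8, 12   # 96-pin device
PIN_PITCH_MM = 9.0           # assumption: standard 96-well plate spacing
DOT_PITCH_MM = 1.0           # one dot per 1 mm x 1 mm cell
SIDE = 8                     # assumption: 8 x 8 dots = 64 patients per subarray


def dot_position(patient, pin_row, pin_col):
    """Return (x_mm, y_mm) of the dot printed for one patient under one pin.

    patient is 0..63; each patient plate is printed with a different offset
    so that the 64 dots under a given pin form one subarray.
    """
    if not 0 <= patient < SIDE * SIDE:
        raise ValueError("patient index must be in 0..63")
    if not (0 <= pin_row < PIN_ROWS and 0 <= pin_col < PIN_COLS):
        raise ValueError("pin position outside the 8 x 12 device")
    off_row, off_col = divmod(patient, SIDE)
    x = pin_col * PIN_PITCH_MM + off_col * DOT_PITCH_MM
    y = pin_row * PIN_PITCH_MM + off_row * DOT_PITCH_MM
    return x, y


def gap_between_subarrays():
    """Unprinted space left between neighbouring subarrays, in mm."""
    return PIN_PITCH_MM - SIDE * DOT_PITCH_MM
```

Under these assumptions each subarray occupies 8 mm of every 9 mm pin interval, which leaves the 1 mm space between subarrays, and the farthest dot lies within the 8 x 12 cm membrane.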

Another approach is to use membranes or plates (available from NUNC, Naperville, Illinois)
which may be partitioned by physical spacers, e.g., a plastic grid molded over the membrane, the grid
being similar to the sort of membrane applied to the bottom of multiwell plates, or hydrophobic
strips. A fixed physical spacer is not preferred for imaging by exposure to flat phosphor-storage
screens or x-ray films.

The present invention is illustrated in the following examples. Upon consideration of the
present disclosure, one of skill in the art will appreciate that many other embodiments and variations
may be made in the scope of the present invention. Accordingly, it is intended that the broader
aspects of the present invention not be limited to the disclosure of the following examples. The
present invention is not to be limited in scope by the exemplified embodiments which are intended
as illustrations of single aspects of the invention, and compositions and methods which are
functionally equivalent are within the scope of the invention. Indeed, numerous modifications and
variations in the practice of the invention are expected to occur to those skilled in the art upon
consideration of the present preferred embodiments. Consequently, the only limitations which
should be placed upon the scope of the invention are those which appear in the appended claims.
All references cited within the body of the instant specification are hereby incorporated by
reference in their entirety.

5. EXAMPLES 

5.1 EXAMPLE 1 

Novel Nucleic Acid Sequences Obtained From Various Libraries
A plurality of novel nucleic acids were obtained from cDNA libraries prepared from various
human tissues and in some cases isolated from a genomic library derived from human chromosome
using standard PCR, SBH sequence signature analysis and Sanger sequencing techniques. The
inserts of the library were amplified with PCR using primers specific for the vector sequences
which flank the inserts. Clones from cDNA libraries were spotted on nylon membrane filters and
screened with oligonucleotide probes (e.g., 7-mers) to obtain signature sequences. The clones were
clustered into groups of similar or identical sequences. Representative clones were selected for
sequencing.

In some cases, the 5' sequence of the amplified inserts was then deduced using a typical
Sanger sequencing protocol. PCR products were purified and subjected to fluorescent dye
terminator cycle sequencing. Single pass gel sequencing was done using a 377 Applied Biosystems
(ABI) sequencer to obtain the novel nucleic acid sequences. In some cases RACE (Rapid
Amplification of cDNA Ends) was performed to further extend the sequence in the 5' direction.

5.2 EXAMPLE 2

ASSEMBLAGE OF SEQ ID NO: 1, 2 and 3

The novel nucleic acid (SEQ ID NO: 1) of the invention was assembled from sequences that
were obtained from a cDNA library by methods described in Example 1 above. The final sequence
was assembled using the EST sequences as seed. Then a recursive algorithm was used to extend the
seed into an extended assemblage, by pulling additional sequences from Hyseq's database
containing EST sequences that belong to this assemblage. The algorithm terminated when a
complete contig was assembled. Inclusion of component sequences into the assemblage was based
on a BLASTN hit to the extending assemblage with BLAST score greater than 300 and percent
identity greater than 95%.
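
The seed-extension procedure can be illustrated with a minimal Python sketch. It is not the algorithm actually used; it only shows the inclusion rule (BLAST score greater than 300 and percent identity greater than 95%) applied repeatedly until no further sequence qualifies. The function `find_hits` is a hypothetical, caller-supplied stand-in for a BLASTN search, and for simplicity the sketch searches with each newly added member rather than with the growing contig. The recursion is written as an iterative loop.

```python
MIN_SCORE = 300        # BLAST score must be greater than this
MIN_IDENTITY = 95.0    # percent identity must be greater than this


def passes(hit):
    """Apply the inclusion rule to one hit (dict with 'score' and 'identity')."""
    return hit["score"] > MIN_SCORE and hit["identity"] > MIN_IDENTITY


def extend_assemblage(seed_ids, find_hits):
    """Grow a set of member sequence IDs until no new member qualifies.

    find_hits(member_id) is a hypothetical function returning an iterable of
    dicts with keys 'id', 'score' and 'identity' for sequences hit by that member.
    """
    members = set(seed_ids)
    frontier = list(seed_ids)
    while frontier:                      # terminates when nothing new is pulled in
        current = frontier.pop()
        for hit in find_hits(current):
            if hit["id"] not in members and passes(hit):
                members.add(hit["id"])
                frontier.append(hit["id"])
    return members
```

A caller would supply `find_hits` backed by whatever search tool and database are available; the thresholds are strict inequalities, so a hit at exactly 95% identity is excluded.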

The nearest neighbor result for the assembled sequence (SEQ ID NO: 1) was obtained by a
FASTA version 3 search against Genpept release 114, using Fastxy algorithm. Fastxy is an
improved version of FASTA alignment which allows in-codon frame shifts. The nearest neighbor
result showed the closest homologue for each assemblage from Genpept (and contains the translated
amino acid sequences for which the assemblage encodes). The nearest neighbor result is set forth
below:



Accession No.   Description                                 Smith-Waterman Score   % Identity
Z35597          Unknown weak similarity with sea squirt     760                    36.188
                nidogen precursor protein (blastp score
                71); cDNA EST EMBL:

Polypeptides were predicted to be encoded by SEQ ID NO: 2 (or 3) as set forth below. The
polypeptides were predicted using a software program called FASTY (available from
http://fasta.bioch.virginia.edu) which selects a polypeptide based on a comparison of translated
novel polynucleotide to known polypeptides (W.R. Pearson, Methods in Enzymology, 183: 63-98
(1990), herein incorporated by reference).



Predicted beginning nucleotide location corresponding to first amino acid residue of
amino acid segment: 2669

Predicted end nucleotide location corresponding to last amino acid residue of
amino acid segment: 1388

Amino acid composition of the polypeptide encoded, wherein (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop Codon, /=possible nucleotide deletion,
\=possible nucleotide insertion):

PRVRPRVRTDHNYYISRIYGPSDSASRDLWVNIDQME
KDKVKIHGILSNTHRQAARVNLSFDFPFYGPIFLREITV
ATGGFIYTGEVVHRMLTATQYIAPLMANFDPSVSRNS
TVRYFDNGTALWQWDHVHLQDNYNLGSFTFQATL
LMDGRIIFGYKEIPVLVTQISSTNHPVKVGLSDAFWV
HRIQQIPNVRRRTIYEYHRVELQMSKITMSAVEM
PTCLQFNRCGPCVSSQIGFNCSWCSKLQRCSSGFDRH
RQDWVDSGCPEESKEKMCENTEPVET^LEPPQP*ERQ
PPSSGS*LPPE/DAVTSQFPTSLPTEDDTKIALHLKDNG
ASIDDSAAEKKGGTLHAGLIVGILILVLIVATAILVTV
YMYHHPTSAASIFFIERRPSRWPAMKFRRGSGHPAYA
EVEPVGEKEGFIVSEQC (SEQ ID NO: 3)
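
Because the predicted beginning location (2669) is greater than the end location (1388), the segment appears to be read from the reverse strand of SEQ ID NO: 1. The Python sketch below shows only the basic operations involved, reverse complementation and translation with the standard genetic code. It does not reproduce FASTY, which additionally tolerates frameshifts (the "/" and "\" symbols above); the function names are illustrative.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate in frame 1; unknown codons become X, a trailing partial codon is dropped."""
    dna = dna.upper()
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))
```

For a minus-strand segment, one would take the stretch of the plus strand between the end and beginning locations, apply `reverse_complement`, and then `translate`; any frameshift positions would have to be handled separately.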
5.3 EXAMPLE 3

TISSUE EXPRESSION ANALYSIS OF SEQ ID NO: 1

Tissue expression data for SEQ ID NO: 1 are as follows:



Tissue                Vendor                 Library Name
Adult brain           Clontech               ABR008
Ovary                 Invitrogen             AOV001
Cervix                BioChain               CVX001
Endothelial Cells     Stratagene             EDT001
Fetal liver-spleen    Columbia University    FLS001
Umbilical Cord        BioChain               FUC001
Lung, fibroblast      Stratagene             LFB001
Lung, tumor           Invitrogen             LGT002
Spinal Cord           Clontech               SPC001
5.4 EXAMPLE 4

BLAST ANALYSIS OF SEQ ID NO: 1, 2 AND 3

A BLASTN analysis for contig sequence 784_3137 was performed to identify sequences
with potential homology to SEQ ID NO: 1, 2 and 3. The database searched was the public
nucleic acid database entitled /public/blastDBsInPipeLine/nt, posted 4:25:50 PM PST Dec 13,
2000, using BLAST-2.0MP (©1996-2000 Washington University, Saint Louis, Missouri USA;
Gish, W. (1996-2000) http://blast.wustl.edu). The Query sequence ("Q") was the nucleic acid
sequence for contig 784_3137 (SEQ ID NO: 1). The Subject sequences ("S") are the sequences
identified in the BLASTN search with an E value of 0.011 or below. The results are as follows:



                                                                       Smallest Sum
                                                               High    Probability
Sequences producing High-scoring Segment Pairs:               Score   P(N)       N

gb|AF279144.1|AF279144 Homo sapiens tumor endothelial ...      1543    5.1e-83    2
gi|9966886|ref|NM_020405.1| Homo sapiens tumor endothe...      1543    5.1e-83    2
gi|11525843|ref|XM_008292.1| Homo sapiens tumor endoth...      1020    1.3e-38    1
emb|AL034560.3|PFMAL3P8 Plasmodium falciparum MAL3P8, ...       313    0.0024     1
dbj|AP000372.1|AP000372 Arabidopsis thaliana genomic D...       311    0.0029     1
gb|AC007454.3|F23M19 Arabidopsis thaliana chromosome 1...       308    0.0040     1
gb|AF015472.1|AF015472 Plasmodium falciparum microsate...       246    0.0072     1
emb|AL034558.2|PFMAL3P2 Plasmodium falciparum MAL3P2, ...       266    0.0079     2
gb|AC007370.7|AC007370 Homo sapiens, clone 22_A_3, com...       259    0.0090     2
gb|AC004605.1|HUAC004605 Homo sapiens Chromosome 16 BA...       238    0.011      2
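
The summary table reports a smallest sum probability P(N), whereas the cutoff is stated as an E value. Under the usual Poisson assumption of Karlin-Altschul statistics the two are related by P = 1 - exp(-E), so they nearly coincide for small values. The Python sketch below illustrates the conversion; the function names are illustrative and the relation is offered as the standard approximation, not as a statement of how this particular search was configured.

```python
import math

E_CUTOFF = 0.011  # cutoff stated in the text


def e_from_p(p):
    """Expected number of chance hits corresponding to probability p (0 <= p < 1)."""
    return -math.log1p(-p)


def p_from_e(e):
    """Probability of at least one chance hit given expectation e (e >= 0)."""
    return -math.expm1(-e)


def within_cutoff(p, cutoff=E_CUTOFF):
    """True if the hit's probability corresponds to an E value at or below the cutoff."""
    return e_from_p(p) <= cutoff
```

For example, `e_from_p(0.50)` is about 0.69 and `e_from_p(0.0024)` is about 0.0024, consistent with the paired Expect and P values printed with the individual alignments.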



>gb|AF279144.1|AF279144 Homo sapiens tumor endothelial marker 7 precursor
(TEM7) mRNA, complete cds
Length = 2320

Minus Strand HSPs:
Score = 1543 (237.6 bits), Expect = 5.1e-83, Sum P(2) = 5.1e-83
Identities = 561/833 (67%), Positives = 561/833 (67%), Strand = Minus / Plus
STACyiGACCACAATTACTATATATCTCGAATATATGGTCC - ATCTGATTCTGCCAGCCGG 2587

I III Mill! II III I II II I lllll III I II I IIIIIII 

3GACA- ACOVCAGCTATTATGTGTCCOGTCTCTATGGCCCaVGC-GAGCCCCA^ 397 
3ATTTATGGGTGAAC».TAGACCAAATGGAAAAAaATA-AAGTGAAGATTCAT^ 2528 

II I lllll I I I II I II II llllllllll II MM I 

SAACTGTGGGTAGATGTGG-CCGAGGCXyUlCOGGAGCCAAGTGA^ 456 
3TCCAATACTCATCGGCAAGCTGCAAGAGTGAATCT- GTCCTTCGATTTTCCATTTTATG 24 6 S 

10 lllll II II lllll III I mill III MINI lllll II II II I 

CTCGAAGACCCACCGGCAGGCTTCGAGAaTGG- TCTTQTCCTTTGATTTCCCTTTCTACG 515 

SCCACTTC-CTACGTGAAATCACTGTGGaUlCCQGGGGTTTCATATACACI^AaAAG 241C 

I II I I II II I lllll I lllll II II lllll I II II II II 
SGCA- TCCTCTGCGGCAGATCACaiTAGCAACTGGAGGCTTCATCTTCATGQGGGATO 574 

STACATCGAATGCTAACAGCCaOlCTlGTACATAGCACCTTTAATC^ 235C 

I lllll lllll lllll II lllll I II II I lllll II III I III 

P^TCCATCGGATGCrCAOiGCTACr^^ CCC 633 

AGTGTA-TCCAGA-AATTCAACTGXKSlGATATTTTGATAATGaCACAGCyiCTTaTGaT^^ 2292 

I II III II II II II II II lllll lllll INI IIIIIII I 
TGGCTACTCC-GACAACTCCACTWSTOGTTTACTTTGACAATGGGACAGTCTTTGM 692 

AGTGGGACCATGTACATCTCCA- GGATAATTATAACCTGGGAAGCTTCACATTCCAGGCA 2233 

llllllllll II IIIIIII III I I I III II mil iiiiimi 

AGTOOOACCACOTTTATCTCCAAaOCTOaaAAGA- CAAGOGCAOTTTCACCTTCCAOOCA 751 



15 



20 



25 



40 





2645 


S: 


340 


Q: 


2566 


S: 


398 


Q: 


2527 


5: 


457 


Q: 


2468 


S: 


516 


Q: 


2409 


S: 


575 


Q: 


2349 


S: 


634 


Q: 


2291 


S: 


693 


Q: 


2232 


S: 


752 


Q: 


2173 


S: 


811 


Q: 


2113 


S: 


871 


Q: 


2055 


8: 


929 


Q: 


1995 


S: 


989 


Q: 


1937 


S: 


1047 




n ann 
Xo / / 


S: 


1107 


NO: 


4) 



30 Mill I III I II II II mm II mil ii iii i i mi 

GCTCTGCACCATG-ACGGCCGCATTGTCTTTGCCTATAAAGAGATCCCTATGTCTGTCCC 
ACAGATAAGTTCAACaUVTCATCCAGTGAAAGTCGGACTQTCCGATGCATTTGTCGTT^ 

I II II II II I lllll II III III II II mil II I 11 I 

GQAAATCAQCTCCTCCCAGCATCCTGTCAAAACCGaCCTATCGGATGCCTTCATaATTCT 
CCACAGGATCCAAOUUlT-TCCCyVATGT-TCGAAGAAGAACAATTTATGAATACCA 

I I Mil I II I iir I III nil I n i niiiiiiiiii 



810 



imii I II I nil III lllll mill i iimiii ii 



928 

1996 

988 



45 Q: 1995 ACa^TGCCTCCAGTTTAACAGATGTGGCCCCTGTGTATCTTCTCAGAT-TGGC-TTCAACT 1938 

III III II II nil I nil I II mm ii i iiinii 

ICCTGCAGCATAGGAGCTGTGACGCCTGCATGTC- -CTCAGACCTGACCTTCAACT 1046 
^, .TGGTGTAGTAAACTTCAAAGATGTTCCAGTGGATTTGATCQTCATCX^GCAGGACr 1878 

SO MM Mill 1 II II lllll IIIIIMI lllll II MM Mill I 

SCAQCTGGTQCCATQTCCTCCAQAGATGCTCCAOTGGCTTTGACCGCTATCGCC 
3GG-TGG-ACAGTGGA-TGCCCTGAAGAGTCAAAAGAGAAG-ATGTGTGAGAA 

III III II 111 II I- 1111 n I I -I II lllll in I. 
Score = 471 (76.7 bits), Expect = 5.1e-83, Sum P(2) = 5.1e-83

Identities = 227/363 (62%), Positives = 227/363 (62%), Strand = Minus / Plus

Q: 1733 CTTCTCyvOT-TTCC-CACCAGCCTCCCTACAfiAAGATOATACCAAGATAQCACTACATCT 1676 

II III I III I IIIIIII I iiniiiiiimiiin I mini 

S: 1240 CTCCTCCCTCTTCUlTCQACAGCCTCACCACAGAAGATGACACCAAOTT-QAA-TCX:CTAT 1297 




Q: 1675 AAAAGATAATGGAGCTTCTACAGATGACAGTG-CAGCTGAGAAGAAAGGGQGAACCC-TC 1618 

I ii I II I I nil I II I I III HUM II I 

S: 1298 GCAGGAOGAaACGGCCT-T-CAGAACAACCTGTCCCCCT^GAC-AAAGQGCACTC^ 1354 
5 Q: 1617 CACGCTGGCCTC-ATCGTTGGAATCCTCATCCTGGTCCTCATTGTAGCCACAGCCATTCT 1559 

III nil I I Mill II III I I mill I II II I III II 

S: 1355 CAC-CTGGGCACCATCGTGGGCATCGTGCTGGCAGTCCTCCTCGTGGCGGCCATCATCCT 1413 
Q: 1558 TG-TGACAGTCTATATGTATCACCACCCAACATCAGCAGCCA-GCATCTTCTTTATTGAG 1501 

10 III MINI II mill Mill II lllllllll IIJII 

S: 1414 GQCTGOAATT-TACATCAATaGCCACCCCACATCCAATGCTGCGC-TCTTCTTCATCG^^ 1471 
Q: 1500 AGACGCCCA-AGCAGATGGCCTGCGATGAAGTTTAGAAGAGGCTCTGGAOVTCCT^ 1442 

I I II I II Mill II lllllllll I II I Ml III I MM 

15 S: 1472 CGTAGACCTCACCAC-TGGCCAGCCATGAAGTTTCGCAGCCACCCTGACCATTCC^^ 1530 
Q: 1441 TGCrrGAAGTTGAACCAGTTGG--A-GAGAAAGAAGGCTTTATTGTATCAGAGCAGTC^ 1385 

111 II II II II II 1 mil II mil II I I iiiiiiiiii 

S: 1531 TGCXSGAGGTGGAGCCCTCGGGCCATGAGAAGGAGGGCTTCATGGAGGCTGAGCAGTGCTG 1590 



20 



25 



Q: 1384 AAA 1382 
I I 

S: 1591 AGA 1593 (SEQ ID NO: 5) 



>gi|9966886|ref|NM_020405.1| Homo sapiens tumor endothelial marker 7 precursor
(TEM7), mRNA
Length = 2320
Minus Strand HSPs:

Score = 1543 (237.6 bits), Expect = 5.1e-83, Sum P(2) = 5.1e-83

Identities = 561/833 (67%), Positives = 561/833 (67%), Strand = Minus / Plus

Q: 2645 GTACAGACCACAATTACTATATATCTCGAATATATGGTCC-ATCTGATTCTGCCAGCCGG 2587 

35 I 111 mill II III I II II I mil II I I II I imm 

S: 340 GGACA-ACCACAGCTATTATGTGTCCCGTCTCTATGGCCCCAGC-QAGCCCCACAGCCGG 397 
Q: 2586 GATTTATGQGTQAACATAGACCAAATQGAAAAAGATA-AAGTGAAGATTCATaGAATATT 2528 

II I mil I I I II I II 11 iiimim ii iiii i 

40 S: 398 QAACTGTGGGTAGATGTGG-CCGAGGCCUUVCaSGAGCCAAGTGAAGATCCACACAATACT 456 
Q: 2527 GTCCAATACTCyiTCGGCAAGCTGCAAGAGTGAATCT-GTCCTTCGATTTTCCATTTTATG 2469 

mil II II mil III I mm iii mm mii.ii ii ii i 

S: 457 CTCCAACACCCACCGGGAGGCTTCGAGAGTGG-TCTTGTCCTTTGATTTCCCTTTCTACG 515 

45 

Q: 2468 GCCACTTC-CTAaBTGAAATCACTGTGGCAACCGGaGGrrTCATATACACTGaAGAAaTC 2410 

I II I I II 11 I mil >i mil II II mil i ii ii ii ii 

S: 516 GGCTl-TCCTCTGCGGCAQATCACCATAGCAACTCGAGGCTTCATCTTCAT^ 574 
50 Q: 2409 GTAOlTCGAATGCXAACaiGCCaiaiCAGTACATAGCACCTTTAATGQa 2350 

I Mill mil mil II mil i ii ii i iiiii ii iii i iii 

S: 575 ATCCATCGGATGCTCACAGCTACTCAGTATGTGGCGCCCCTGATGGCCAACTTCAA-CCC 633 
Q: 2349 AQTQTA-TCOiaA-AATTCAACTGTCy^GATATTTTGATAATGGCACAQCACTTGTO 2292 

55 I II III II 11 II II 11 11 mil mil nil iiiiiii i 

S: 634 TGGCTACTCC-GACAACTCCy^CAGTTGTTTACTTTGACAATGGGACAGTCTTTGTGGTTC 692 
Q: 2291 AGTGGGACCATGTACATCTCCA-GGATAATTATAACCTGGGAAGCTTCAaiTTCC^^ 2233 

miiiim 11 imm ii I i i i iii ii iiiiriiiiiiiii 

60 S: 693 AGTGGGACCACGTTTATCTCCAAGGCTGGGAAGA-CAAGGGCy^GTTTCACCTTCCAGGCA 751 
Q: 2232 ACCCTQCTC-ATGGATGGACGAATCATCTTTGGATACAAAQAAATTCCTQTCTTGGTCAC 2174 

I nil I III I II II II mm n inn ii iii i i iii i 

S: 752 GCTCTGCyvCCATa-ACGGCaSaiXTGTCTTTQCCTATAAAGAGATCCCTATCTCTGTCCC 810 
Q: 2173 ACAGATAAGTTaU^CaUVTCATCCyWSTQAAAGTCGaACTGTCCmTOCATTTGTCe 2114

I II II II II I mil II III III II II inn ii i ii i 

S: 811 GQAAATCAGCTCCTCCaVGCATCCTGTCAAAACCGGCCTATCGGATQCCTTC^ 870 
Q: 2113 CCyVCAGOATCCAACAAAT-TCCCAATGT-TCGAACaAAGAACAATTTATCa^ 2056 

I I nil I II I III I III nil I 11 I iiiiiiiiiiii 

S: 871 CAATCC-ATCCC-CGGATGTGCaVGAATCTCGGCGAAGGAGCATCTTTaAATACCACCGC 928 
Q: -2055 GTAGAGCTACAAATGTaVAAAATTACCAACATTTCGOCTOTGGA6AaX3ACCC<^ 1996 

iiiiiii I II I nil III mil Mini I iiiiini ii 

S: 929 ATAGAGCTGGACCCCAGCAAGGTCACCAGCATGTCGGCCGTGGAGTTCACCCOITTG^ 988 
Q: 1995 ACyVTQCCTCCAQTTTAACAGATQTGGCCCCTGTQTATCTTCTCAaAT-TGGC-TTCAACT 1938 

II iiiii III 11 II nil I nil I n iniii ii i iiiiiii 

S: 989 ACCTGCCTGCAGCATAGGAGCTaTaACGCCTGCATGTC--CTCAGACCTGACCTTCAACT 1046 
Q: 1937 GCAGTTGGTGTAGTAAACTTCaUVAGATGTTCCaVGTGGATTTGATCGTCATCGGCAGGACT 1878 

nil inn i ii ii inn iiniiii inn ii iiii inn i 

S: 1047 GCAGCTGGTGCCATGTCCTCCAGAGATGCTCCAGTGGCTTTGACCGCTATCX3CCAGGAGT 1106 
Q: 1877 GGQ-TGG-ACAGTGGA-TGCCCTQAAGAGTCAAAAQAGAAQ-ATGTQTGAGAA 1829 

III III II III II I I Ml III I I 11 inn ni i 

S: 1107 GGQATGGGACTATGGGCTGTGCACAGGAGGCAGAGGGGCAGGATGTGCGAGGA 1159 (SEQ ID 
NO: 7) 

Score = 471 (76.7 bits), Expect = 5.1e-83, Sum P{2) = 5.1e-83 

Identities = 227/363 (62%), Positives = 227/363 (62%), Strand = Minus / Plus

Q: 1733 CTTCTCAGT-TTCC-CACCAGCCTCCCTACyiGAAGATGATACCaVAQATAOCACTACATCT 1676 

II III I III I IIIIIII I nniiiiiii linn iiiiiii 

S: 1240 CTCCTCCCTCrrCATCGACAGCCTCACCAC^GAAGATGACACCAAGTT-GAA-TCCCTAT 1297 
Q: 1675 AAAAGATAATGGAGCTTCTACAGATGACMTG-CAGCTGAGAAGAAAGGGGGAACCC-T^ 1618 

IIII II I I IIII I II I I III MUM M I 

S: 1298 GCAGGAGGAGACGGCCT-T-CAGAACAACCTGTCCCCCAAGAC-AAAGGGCACTCCTGTG 1354 
Q: 1617 CACGCTGGCCTC-ATCGTTQGAATCCTCATCCTGGTCCTCATTGTAGCCACAGCCATTCT 1559 

III IIII I I IIIII II III I I llllll IIIII I III II 

S: 1355 CAC-CTGGGCACCATCGTGGGCATCGTGCTGGCA6TCCTCCTCGTGGCGGCCATCATCCT 1413 
Q: 1558 TG-TGACAGTCTATATGTATCSlCCACCCAACATCAGCAGCCa^-GaiTCT^ 1501 

I II I III M M llllll IIIII II II IIIIMI II III 

S: 1414 GGCTGGAATT-TACATCAATGGCCACCCCyia^TCCAATGCTGCGC-TCTTCTTCATC^ 1471 
Q: 1500 AQACGCCCA-AGCyVGATGGCCTGCGATQAAGTTTAGAAGAGGCTCTGGACATCCTGCCTA 1442 

I I II III IIIII II iinnni i ii i in in i iiii 

S: 1472 CGTAGACCTCACCaiC-TGGCCAGCCATGAAGTTTCGCAGCCACCCTGACCATTCCACCTA 1530 
Q: 1441 TGCTGAAGTTGAACCAGTTGG--A-GAGAAAGAAGGCTTTATTGTATCAGAGCAGTGCTA 1385 

III II II II II n I inn ii inn ii i i niiiiiiii 

S: 1531 TGCGGAGGTGGAGCCCTCGGGCCATGAGAAGGAGGGCTTCATGGAGGCTGAGCAGTGCTG 1590 

Q: 1384 AAA 1382 
1 I 

S: 1591 AGA 1593 (SEQ ID NO: 8) 



>gi|11525843|ref|XM_008292.1| Homo sapiens tumor endothelial marker 7 precursor
(TEM7), mRNA
Length = 892
Minus Strand HSPs:
Score = 1020 (159.1 bits), Expect = 1.3e-38, P = 1.3e-38
Identities = 366/543 (67%), Positives = 366/543 (67%), Strand = Minus / Plus
3TACAj3ACCACAATTACTATATATCTCaAATATATGGTCC- ATCTQATTCTGCC^^ 2587 

I III iiiiii II III III II I mil II I I II I iiiiiii 

GGACA-ACCACAGCTATTATGTGTCCCGTCTCTATGGCCCCAGC-GAGCCCCACAGCCGG 397 
C3ATTTATGGGTGAACATAGACCAAATGGAAAAAGATA-AAGTGAAGATTCATGGAATATT 2528 

ir I Mill I II II I II II iiiiiiiiii II Mil I 

GAACTGTGGGTAGATGTQG- CCGAGGCCAACCGGAGCCAAGTGAAGATCCACACAATACT 456 
QTCCAATACTCATCGGCAAQCTQCMIQAQTQAATCT-QTCCTTCGATTTTCCATTTTATQ 2469 

Mill II M Mill III I MM!! Ill lIMM IIMI M M II I 

CTCCAACACCCACCGGOIGGCTTCGAGAGTGG-TCTTGTCCTTTGATTTCCCTTTCTACG 515 
GCCACTTC-CTACQTGAAATCa.CTGTGGCAACCGGGGGTTTCATATACACTGGAGAAGT^ 2410 

MM Mill I Mill I mil II II IIMI I M II II II 

GGCA-TCCTCTGOSGCAGATCaiCCATAGCAACTGGAGGCTTCATCTTCATGGGGGACGTG 574 
GTACATCQAATGCTAACAGCCACACAGTACATAGCACCTTTAATGQCAAA 2350 

I Mill Mill mil M mil Mill I mil ii iii i iii 

ATCCATCGGATGCTCACAGCTACTCAGTATGTGGCGCCCCTGATGGCCAACTTCAA-CCC 63 3 
AGTGTA- TCCAGA- AATTCAACTGTCAGATATTTTGATAATGGCACAGCACTTGTGGTCC 2292 

I II III M II II II II II Mill Mill MM IIMIII I 

l^CTACTCC-GACAACTCCACAGTTGTTTACTTTGACAATGGGACAGTCr^^ 692 
AGTGGGACCATGTACATCTCCA- GGATAATTATAACCPGaQAAGCTTCACATTCCAGGCA 2233 

IIIIIIIIII II ll.lllll III I I I III II lltll lllllllll 

AGTGGGACCACGTTTATCTCCAAGGCTGGGAAGA- CAAGGGCAGTTTCACCTTCCAGGCA 75 1 
ACCCTGCTC-ATGGATGQACGAATCATCTTTGGATACAAAGAAATTCCTGTCTTGGTCAC 2174 

I nil I 111 I II II II Mill! II mil II III II III I 

QCTerGKACCATa- AC<3GCCaCATTGTCTrTSCCTATAAAaAGATCCCTATOTCTO 810 
ACa,GATAAGTTaiACCAATCATCCAGTGAAAGTCGaACTGTCCX3ATGCATTTCTCGTTGT 2114 

I II II II II I IIMI II III III II II Mill II I II I 





Q: 


2645 


5 


S: 


340 




Q: 


2586 


10 


S: 


398 




Q: 


2527 




S: 


457 


15 


Q: 


2468 




S: 


516 


20 


Q: 


2409 




S: 


575 






2349 


25 


S: 


634 




Q: 


2291 


30 


S: 


693 




Q: 


2232 




S: 


752 


35 


Q: 


2173 




S: 


811 


40 


Q: 


. 2113 




S: 


871 



>emb|AL034560.3|PFMAL3P8 Plasmodium falciparum MAL3P8, complete sequence
Length = 108,908

Plus Strand HSPs:
Score = 225 (39.8 bits), Expect = 2.2, Sum P(2) = 0.88

Identities = 159/273 (58%), Positives = 159/273 (58%), Strand = Plus / Plus
50 Q: 265 ATAAACAAAAATATATTTATATATATATATTTATGTAACTACTATGTGCTTTAAAGAAAA 324 

Mill Ml III I IIIIMIIIMII Ml II II III I MM I I 

S: 89849 ATAAATAAATATACACATATATATATATATATATATATATA-TATATA-TATATATATAT 89906 
Q: 325 TTACTGTATGArrCAGCAGGGTTTTTTCATTCTTTCTATCGCCATQ-CTGAOT^ 382 

55 II II II I III IIIIII III II II III II nil I 

S: 89907 ATATATTAAAATA-ATCAGTTTATTATTAT-CTTACT-TCOTCATTACTATATTGAAATA 89963 
Q: 383 GCT-TCTTAGAaW3GAATCAGCAATGAAATGGGTGTTTAGTA-CT^ 440 

I I MM III MM III MM II I mil III 

60 S: 89964 TCCATACTAAA-ATAAATAA-CAACAGAATTT-TATTAAATATCAAAAAATGTTTTATAC 90020 
Q: 441 TAAAAATAATCATATGTTTAAAATCCCTTAACATTTTTGTTTC^^ 500 

II I M II M ill III mm I MM MM! 

S: 90021 -ATATTTTCTGACATTTTCCCAATATATTTATATTTTTACATGTAAAACAAATA-^GATT 90077 
Q: 501 TCTTCA-TACTGTTTCCTTT-TGT-TTCCATCA 530

I nil I II II 111 I III II III 

S: 90078 TATTCACTCCTATT-CCTATCTGTCTTATAACA 90109 (SEQ ID NO: 10) 
Score = 210 (37.6 bits), Expect = 5.2, Sum P(3) = 0.99

Identities = 164/279 (58%), Positives = 164/279 (58%), Strand = Plus / Plus
TATAAACAAAAATATATTTATATATATAT- AT" TTATGTAACTACTATGTGCTrTAAAGA 321 

MM I I I mill iiiiiiiiiii II III! I II III I mil II 

TATATATATATATATATATATATATATATTATATTATCAACCAAATAT-TCCTTTAT-GA 68697 

- AAAT"TACTGTATGATTaU3CAGGGTTTTTTCATTCTTTCTATCGCCATQCTGACGTC 379 

Mil III III III I I M I I I M I III I II I 

CAAATATAAT-TATTATTAATATATATAATATQAATGTACCaUl-CAA- ATTCA-AQQTAA 68753 
-ATGGCTTCTTAGAO^GGAATCAGCAATGAAATGGGTGTTTAGTACTQAAAAGGCA 438 

III MM! mil I I ill I I I I I lllll ill 

TATGACTTATCATAT-QGAATATATACTa^AAAAG-TATA-ACAAAA-AAAAQA-ATATT 68808 

ATTAAAAAT- -AA-TCATATGTTTAAAATCCCTTAACATTTTTGTTTCAAA- -AATG-AT 492 

lllll I II I I II III I II III mil I II III III II 

- TTAAAGGTTCAAGTAAAATTTTTTATATTTCTTT - CATTTATTTTACAACTTAATATAT 68866 

TA-AGGAACTCTTCATACTGTTTCCTTTTGTTTCCATCA 530 

II I III I I Ml I I 11 II II II I 



10 


Q: 


264 




S: 


68640 




Q: 


322 


15 


S: 


68698 




Q: 


380 


20 


S: 


68754 




Q: 


439 




S: 


68809 


25 


Q: 


493 




S: 


68867 




30 


Score B 
Identit: 




Q: 


29 


35 


S: 


40897 




Q: 


87 




S: 


40954 


A A 

40 


Q: 


138 




S: 


41014 


45 


Q: 


195 




S: 


41074 




Q: 


250 


50 


S: 


41131 




Q: 


310 




S: 


41184 



55 



Score = 187 (34.1 bits), Expect = 2.2, Sum P(2) = 0.88

Identities = 179/308 (58%), Positives = 179/308 (58%), Strand = Plus / Plus

TTTATXTAGCCAGAGAGGGAAGGG-GTTGACTVTA/^CGAAA-AAGTGGATCAAATAGTC^ 

lllll im I) I II III! III! Ill III I II I II I 
TGTTTCTATACAGATAGATATGGATGATGG-A-AAACCAAAGAAG-GAATTTACTAATAT 

AGA-AC-ATGATGGGCG-"C-GGC-AATGA-ACTGAACC:aVCTTT-TGCTAAGTGACA-QA 

II II II III I I II II I I mm II II I I I I II 

GGATACTATCTTGGAAGATCTGGATAAATATAATGAACerTAraATQATGTGa^^ 
AAAATATTCTAATATTAAGGATTATTTTACyUlCT-CTATGGA-AGTAATGCMT^ 

I III! Mil II III II I II II nil nilllll 11 II 

TATTTATTATGATGTAAATGATOITGATGTATCAACTGTGGATAGTAATGCTATGGATAT 

ATCTTGCATC"TGTT--TTGTCTTQ-ATGACAAAAa3CACTCTTAGAGTCACA-AGATCC 

I II III I III II III II II II III I I II It 

ACCTAGCAAAGTACAAATTGAAATGGATGTAAATAC- CAAA-TTGGTGAAAGAGAAATA- 

TGCCTTGTGTTAGTTATAAACAAAAATATATTTATATATATATATTTATGTAACTACTAT 

I III I II I II II nil iimmim mm ii in 



I II I 

STTTGA 41191 (SEQ ID NO: 12) 
Score = 157 (29.6 bits), Expect = 5.2, Sum P(3) = 0.99

Identities = 111/188 (59%), Positives = 111/188 (59%), Strand = Plus / Plus



Q- 1029 AAAGCTTTAT-TATGATTTTTTTTTTCTGATCCCTTTGCyVAC-C-CTO 1085 

60 III II I III IIIIIIIUIIII I ill III I III i II III 

S: 83679 AAATATTQAAGTATAATTTTTTTTTTCTTTTTTTTTTACAJVATCACTGaATaAAATTCAA 83738 






10 





1086 


A- AGC- - ATTATAATCTTGTCATACT- TCAGAT- AAGTCCACGGGAGATGTTCCGAGTGA 

1 i 1 llllllll 11 1 III 1 1 1 II III II II II 
AGATCTAATTATAATATTAT-ATATTATAAAATTAAGGGCAAAAAAAAAAAAAAAAAA-A 


1140 


S: 


83739 


83796 


Q: 


1141 


ACTATAQATGACATTCCACTAGGGAATTCTATGTTCAGTGTAAATGGTATCTTGTATAAG 

II III II 1 II II 1 MM III 1 II 1 II III II 1 Mil 

AC-ATATAT-ATATATAAAAAATTATTTCCA-GTTAAATGAATATAATATGTT-T-TAAG 


1200 


S: 


83797 


83851 


Q : 








S: 


83852 


11 1 11 

TTACACTT 83859 (SEQ ID NO: 13) 





Minus Strand HSPs:
Score = 313 (53.0 bits), Expect = 0.0024, P = 0.0024
Identities = 235/401 (58%), Positives = 235/401 (58%), Strand = Minus / Plus

Q: 513 AACAGTATGAAGAGTTCCTTAATCATTT--TTGAA-ACAAAAATGTTAAGGGATTTTAAA 457 

INI Mil II III! I II III I I MM III I Mill 

S: 92109 AACAACTTGAAATTTTTATTAAAAAATTAQTTGTTCAAAAAAAAAATAAATTAAATTAAA 92168 

20 

Q: 456 a^TATGATTATTTTTAATT-TTATGCCTTTTCAGTACTAAACACCCATT-TCATTGC-TG 400 

MM nil II II 111 I nil I I lilt I I Mill I 

S: 92169 -ATATTATTAAGTTO^CATTA-GT-TTTTAAA-ArrAAAGAATAAAAATCATTATGTA 92224 
25 Q: 399 -AT-TCC-TQTCTA-AGAAGCCATTCACGTC-AGCATGGCQATAGAAAGAATGAAAAAAC 345 

III I I II I II 11 I I I I III III II HUM 

S: 92225 TATATATATATATATATAATATATATATATATATTAAAATAATAAAAAAAAAAAAAAAAA 92284 
Q: 344 CCTGC-TGAATCATACAGTAATTTTCTTTAAAGCACATAGTAGTTACATAAATATATATA 286 

30 III I Hill III iiiiiii I I I I III II I nil iiiiiiiii 

S: 92285 A-TQCATAAATCA-ACATTAATTTTATAAATTG-ATATA-TACGT-CATATATATATATA 92339 
Q: 285 TATAAATATATTTTTGTTTATAACTAACACAAGGCAGGATCTTGTGACTC-TAAGAGTC 227 

nil nun i i i in inn i i i i i i inn 

35 S: 92340 TATATATATATATATATATATOTATAACA-ATTATTTCA-CAAATTAATAATAAGACATA 92397 
Q: 226 GTTTTGTGATCAAGACyUAA-CAGATGCAAG-ATGC-ArCACTGCAT^^ 170 

n I I nil I nil ii i i i n ii i n in i iiii i 

Si 92398 TTTATATATTCAATAAAAAATCAAAAGGATTTATAATATAAATQTArT--TAGCATATAT 92455 

40 

Q: 169 TTGTAAAATA-ATCCTTA-ATAT-TAGAA-TATTTTTCTGT 133 

I llllll II II IIII II II III 111 111 
Si 92456 AAGAAAAATATATATATATATATATACAACTATATTTTrQT 92496 (SEQ ID NO: 14) 

Score = 259 (44.9 bits), Expect = 0.69, P = 0.50

Identities = 233/407 (57%), Positives = 233/407 (57%), Strand = Minus / Plus





Q: 


528 


ATGGAA-ACAAAAGGAAACAGTATGAAQAGTTCCTTAATCATTTTTGAAACAAAAATGTT 

llllll III III inn 11 III inn ii ii ii 


470 


50 


S: 


68767 


ATGGAATATATACTCAAAAAGTATAACAAAAAAAAGAAT-ATTTTA-AAGGTTCAA-GTA 


68823 




Q: 


469 


AAGGGATTT - TAAACATATGATT - ATTTTT - AATTTTATGCCTTTTCAGTACTAAACACC 


413 


55 


S: 


68824 


II III n 1 1 III inn ii ii ii n i i i ii ii 

AAATTTTTTATATTTCTTTCATTTATTTTACyVACTTAATATATTATATGAAATATACQTA 


66883 


Q: 


412 


CATTTC-ATTGCTGATTCCTGTCTAAGAAGCCATTCACGTCAGCATGGCGATAGAAAGAA 


354 




S: 


68884 


1 1 IIII II II II II II Mill II MM M IIII 

TTTATATATTG - TGGTTA- TATG - AA- AATATATTCA- - TAAT- ATG- CTATTTATAAAA 


68935 


60 


Q: 


353 


TGAAAAAACCC - - TGC - TGAATCATACAGT - AATTTTCTTTAAAGCACATAGTAGTTACA 

llllll 1 1 1 III 1 1 1 1 II II llllll II II III 

AAAAAAAATTGAATACATTAATTAAAAAATTAAAATT-TTTAAAAATTATTATA-TTA- - 


298 




S: 


68936 


68991 






Q: 297 TAAATATATATATATAAATATATTTTTGTrrATAACTAACAC»AGGCAGGA-TCTT6T6A 239 

II iiiiiiiiiiiii mill II mm i i ii i i i i ii i i 

Si 66992 TACATATATATATATACATATATGTTA"TTTATATTTTATTCATT0AAAAAATATT-r-A 65048 
5 Q: 238 CTCTAAGAGTGCGTTTTGTCyiTCTUlGACyiAA-ACAGATQCAAGATGC^ 180 

I I 1 I I I II I II II III nil I I II II I nil 

S: 69049 TTTTTATATTTTATATT-T-ATAAATCCAACCACAGTTTTATTATAAATTAAAATATTAT 69106 
Q: 179 T-TCCATAGAGTTGTAAA-ATA-ATCCTTAATAT-TAGAATATTTTT 137 

10 II III I mn I III II mil i i ini i i 

S: 69107 TATTTATAAAATTGTATAGATATATTCTTAAAAAATTCTATATATAT 69153 (SEQ ID 
NO: 15) 

Score = 254 (44.2 bits), Expect = 1.2, P = 0.69
Identities = 234/412 (56%), Positives = 234/412 (56%), Strand = Minus / Plus

Q: 514 AAAO^GTATGAAGAGTTCCTTAATCA-TT-T-T-TGAAACAAAAATGTTAAGGGATTTTA 459 

nil III III im I II I M III mil ii i iii 

S: 87283 AAACCATCTTATAATTTAAATAATGAATTGTGTATCAAAAAAAAAAAAAAAAAAACCTTA 87342 



20 



Q; 458 AACATATGATTATTTT-TAATTTTATGCGTTTTCAGTACTAAACACCC-ATTTCATTGCT 401 

II im I I 11 iiiiiim n I i II II I I ii i iii i 

S: 87343 AAAATATCAAAACATreTAATTTTATTT-TTAT-A-TAATATATATTATATATAATTAAT 87399 



25 0: 400 GATTCCTGTC-TAAG-AAGCCATTCACGTCAGCATGGCGATAGT^OAATGAAAAAACCC 343 

II I III II II I II III I I III III nil I 

S: 87400 -ATAAAGAACATAATTAAAATGTTTAATTC-GCAGATAAAAAAAAAAAATTAAAACTATC 87457 



Q: 342 TGCTGAATCA-TACAGTAATTTTCTTTAAAGCACATAGTAGTTACATAAATATATATATA 284 

30 I III I II I III I I I II I I III II II III imiiniii 

S: 87458 -GTAAAATAAATAAA-TAAATATATATATAT-ATATA-TATATATATATATATATATATA 87513 
Q: 283 TAAATATATTTTTGT-TTATAAC-TAACACyUVGGCAGG-ATCT-T-GTGACT-CTAAGAG 230 

II nil I 1 I I linn ni i in i ii in i i ii i 

35 S: 87514 TATATATTTATATATATTATAATGTAAAAAAAGATATATATTAATCGTTTATGCAAACA- 87572 
Q: 229 TGCGTTTTGTCA--TC-A-AGACAAAACAGATCCAAGATGCATCACTGCaiTT 175 

I III n I I I I nil M II I I III III n i 

S: 87573 TAAAAATTGACTU^ATATATAAAAAAAAAAAAAAAAAAAAAAAAAACTT-ATTTIT^ 87631 

40 

Q: 174 TAGAGTTGTAAAATAATCCTTAATATTAGAATATTTTTCTGTCACT-TAGCA 124 

II I III nil I I nil I ninii i i i i mn 

S: 87632 TAAAAAAATAAGATAAACTTCAATACAA-AATATTTAT-TAAAAATATAGCA 87681 (SEQ ID 
NO: 16) 

Score = 236 (41.5 bits), Expect = 7.7, P = 1.00
Identities = 286/505 (56%), Positives = 286/505 (56%), Strand = Minus / Plus

Q: 532 GCTGAT-GGAAACTVAAAGGAAACAGTATGAAGAGTTCCTTAATCATTTTTGAA^ 475 

so I nil III I nil null I I I mn i ii i nil i i 

S: 63353 aOTGATAG<3AGA-AAAA--AATCATTTTATAAA-TA--TTAATAAOTTAT-AAACTATAT 63405 
Q- 474 ATGTTAAG(3GATTTTAAAaVTATGATTATTTTTA-AT--TTTAT-GCCTTTTCAQTACTA 419 

I I II It II I II I mil n n in i i mi ii ii 

55 S: 63406 AAATAAATCTATAGTATA-ATTTTTCTATTTATATATGATTTCTAGTAGTTT-ATTATTA 63463 
Q' 418 AACACCCATTTa^TTQCTGATTCCTGTCTAAGA-AGCCATTCACGTCAGCATGGC^ 360 

INI III I II I II II II II III I I MM 

S: 63464 TTATTTTATTTAATTAAT-AGTAATAAATACTATAACC-TTATTTTCATAAAQA--ATAG 63519 



45 



60 



0- 359 AA-A-GAATGAAAAAACCCTG-CTGAATaVT-ACAGTAATTTTCTTTA;^ 305 

II nil II I II II III I I I I II nil I im i 

S: 63520 TATATGAATATTATTAQTATAACTTAACC^TTATATTTCTATTTTTTACATTCCATACAT 63579 






Q: 


304 


AGTTACATAAATATATATATATAAAT-ATATTTT-TGTT-TATAACTAACACAAGGCAGG 

1 III III llllllllllllllll Mill 1 mil 1 1 III III 
A-TTATATATATATATATATATAAATTACATTTCATAACATATAAATTATA-AATGCAAT 


248 


S: 


63580 


63637 


Q: 


247 


ATCTTGTGACT- CTAAGAGTGCGTTTTGTCaiTaUlGACAAAACA- G- ATGCAAG^ 

1 1 II II Ml M 1 1 III Mini MM 1 MM 

AAAATCATATTACTTAGAAA- - - TATAATAT^CAACACAAAAAATGTATATTATTATGGA 


192 


S: 


63638 


63694 


Q: 


191 


T- - CACTOCATTACTTCCATAQAGT- TGTAAAATAATCC- TTAATATT- AOAATATTTTT 

1 III 1 11 1 II 11 II MM 1 mill 1 1 IMIIIII 

TATCATTATAATATGTTT-TAAATTCTATTTAATATACAATTAATAATGATCATATTTTT 


137 


S: 


63695 


63753 


Q: 


136 


-CTGTCACTTAGCAAAAGTGGTTCAGTTCATTQCCGCGCCCATCATGTTCTTQAC-TAT- 


80 


S: 


63754 


1 1 MM MM 1 II II III II II II i III i II 
TCCTTAACTT-GAAATAATATTTTT-TT-ATTATATCGTATATAAT-TCCTTTAAATACA 


63809 


Q: 


79 


TTGATCCACTTTTTCGTTTATGTCA 55 




S: 


63810 


III 1 1 II 1 mil III 

TTG-TAAAAATTATA-TTTAT-TCA 63831 (SEQ ID NO: 17) 




Score = 144 (27.7 bits), Expect = 0.024, Sum P(2) = 0.023

Identities = 148/276 (53%), Positives = 148/276 (53%), Strand = Minus / Plus


Q: 


1407 


TTTATTGTATCAGAGCyWSTGCTAAAATTTCTAGGACAGAACAACACCAGTACTQQTTTAC 


1348 


S: 


69280 


III II n 1 i II III II 1 II 1 II II 1 II 1 II 

TTTTTTTTTTTAAATAATTA- TATACATTGAATGAAACAATAATTATATTATTATAATAT 


69338 


Q: 


1347 


AGGTGTTAAGACTAAA-ATTTTGC-CTATACCTTTAAGACAAAC?^CAAAC^^ 

IMIIIII Mill Mill 1 III III 1 1 III II 

AAATTTAAATATTTTATATTTTTCACTTTAT-TATAAATTAAATATT-ATACATTAATAT 


1290 


S: 


69339 


69396 


Q: 


1289 


AAACAAGCTCTAAGCTGCTGTAGCCTQAAGAAGACAAGATTTCTGGACJ^GCTCAGCCCA 


1230 


S: 


69397 


II III III MM M III Ml II 1 M III 

TAA^AAGTTCTTTTTTGTTTTTATTTCATATAGATTAAATCTATTATTTATATAAACC-A 


69454 


Q; 


1229 


GQAAACAAAQQGTAAACAAAAAACTAAAACTTATACAAQATACCATTTACAC- -TGAACA 


1172 


S: 


69455 


III II lllllllllll MM III III II 1 II II II 
TTAAATAATATATAAACAAAAAAAAAAAAATTAAATAACTTAAAAAAATU^GATGCTCA 


69514 


Q: 


1171 


TA-GAATTCCCTAGTGGAATGTCATCTATAQTTCAC 1137 

M III 1 II 1 III MM Ml II II 

TATGAAAAACATA-T--AAT-TCATATATGATTTAC 69546 (SEQ ID NO: 18)




S: 


69515 








>dbj|AP000372.1|AP000372 Arabidopsis thaliana genomic DNA, chromosome 5, TAC
clone:K23F3
Length = 36,824
Minus Strand HSPs:
Score = 311 (52.7 bits), Expect = 0.0029, P = 0.0029

Identities = 265/448 (59%), Positives = 265/448 (59%), Strand = Minus / Plus



I IIIII IIM III I I II mil II III I Mi l l I MM 

TTrTAAGCCCCAATAAATCTACTACTAGTTGTAATGCTGCTTACGAAAGAGATTACGA 
- OA- GTATGAAGAGTTCCTTAATCATTTTTGAAACAAAAATGTTAAGGGATTTTAAACAT 

11 II I II MM II mill I IIIM I Ml MM I 

TCATGTTTTCACAATACCATTGTCTCTTTTGAGA-AAAAA-Q--AAGACCCTATGAAT--T 
ATGATTATTTTTAA-TTTTATGCCTTTTCAQTACTAAACACCCA-TTTCATTGCTO^ 

I I 11 MM MM I MM I I I MM I I MM II I II 



Q: 


564 


S: 


4872 


Q:


511 


S: 


4932 


Q: 


453 


S: 


4987 



rQTC-TAAGAAQCCATTOlCGTCAGCyVTGGCmTAaAAAGAATaAAAAA^ 339 

I I III III III! II II II 1 Mill II mil I I 

rATAATAATAAGTAGTTaUV--CATTATT-a3TTTTAJUlGA-TGTTAAAACTATAAA^ 509< 
AATCATACA- -G-TA-ATTX-TCTTTAAAGCACATAGTAGTTACATAAATATATATATA 284 

III III I II II I I I II I I III II II III llllllillll 

AATTTTAAATAGATACATATATATATATAT-ATATA-TATATATATATATATATATATA 515: 

AAATATATTTTTGTTTATAACTAACAaUVGGOWmTCTTGTGACrCTAAaAG 224 

10 ~ llllllillll llllllillll II I I I I I I I I I I I II 

rATATATCTTTTTCGAAAAAACTAACAAATGTTAGTTTTTTTTTTCTGTTAAAGTTAATT 521' 

TTGTCATCAAGACaUUU^CAGATGCAAGATGCATCavCTGCATTACTTCCAT^ - -TG 167 

INI lllll I I II I II M III I I I III I M 

XA-TAAACCT- ACAAAGTAT-TATAAT-TAAATG- -TG- ATTT-TACAAAAGAATACATQ 526 i 
T-AAAATAATCCTTAATATT-A-QAATA 142 

I MM II lllllll I Mill 






Q: 


395 


S: 


5044 


Q: 


338 


S: 


5100 


Q: 


283 


S: 


5158 


Q: 


223 


S: 


5218 


Q: 


166 


S: 


5270 




>gb|AC007454.3|P23M19 Arabidopsis thaliana chromosome 1 BAC P23M19 sequence,
complete sequence
Length = 88,401
Plus Strand HSPs:

Score = 308 (52.3 bits), Expect = 0.0040, P = 0.0040

Identities = 280/490 (57%), Positives = 280/490 (57%), Strand = Plus / Plus



Q: 53 GTTGACATAAACGAAAAAGTGGATCAA-ATAGTCAAGAACATGATGGGCGCGGCAATGAA 111 

30 III II II III I III I II II I llllll I I II 

S: 54392 GTTTTAATTGACTTTAAACTTT-TCAGGAAAGCTAA-AGCATGATTTQTTGAACTCTO 54449 
Q: 112 -CTGAACCACTTTTGCTAAGTGACAGAAAAATATTCTAA-TATTAAGGATTATT-T-TAC 167

II III I Mil I II I MM III II I III II II I II 

S: 54450 GCTAAACAATTTTTT-TGGCTGT-A-AAAAGTTTTTAAAGTCTTACCAATCATCCTCTAT 54505
Q: 168 AACTCTATGGT^GTAATGCAGTGATGCATCTTGCyiTCTGTTTTGTCTTGATGACAAAAC 226 

II M I I II M Ml I I i II II I Ml MM 

S: 54507 ATCACAAAAGTTGGAQTAAAOAQATTCCAAATAC-TATTTTAAATACCAAGGATTAAACT 54565 






Q: 227 GCACTCTTAGA-GTCACAAGATCC-TGCCTTGTGTTAGTTATAAACAAAAATATATTTAT 284

llllll I II I lllll III III III MM II I Ml 

S: 54565 GGA-TGTTTGTCGGCATATTATACGTGA-TT-'T-TTACAAATAGT-AAAAGCAT-TATAT 54619 



Q: 285 ATATATATATTTATGTAACTACTATGTGCTTTAAAGAAAA-TTACTGTATGATTCAGCAG 343

llllllllll III II II lIMM I MM I I II II III II 

S: 54620 ATATATATATATATATATATA-TATGTG-TATAAATTATAGTTTCTCTATTCAAATTAAG 54677 
Q: 344 GGTTTTTTCATTCTTTCTATCGCCATGCTGAOGTGAATG-GCTTCTTAGACaiGGA^ 402 

so III III lllllll I II I I I 11 II I il I I III 

S: 54678 TGTTATTTTATGATCTTTATTTTTGT-CTCAAATTA-TGTGC — CGTATAAATC--TGAG 54731 
Q: 403 CAATGAAATGG-GT-GTTTAGTACTGAAAAGGCATAAAATTAAAAAT-AATCATATGTTT 459 

II MM I II II II II I II Ml MM I MMIII III llllll 

S: 54732 AaATAAAATTGTGTAaTGTATTAAT-AAaAGGTATAATAATAAAAATTAATTATATGTAC 54790
Q: 460 -AAAATCCCTTPAAC-ATTTTTGTTTCAAAAATOATTAAGGAACTCTTO^TAC-TGTT^ 516 

Mill I MM M II llllll III III II I I MM 

S: 54791 GAAAATTCTTTAAAGATACTTAAAAACAAAATGGAGATGG- -CTATA-ATGCGTAATTCC 54847 






Q: 517 TTT-T-GTTT 524 

Ml I MM 

S: 54848 TTTATTGTTT 54857 (SEQ ID NO:20) 
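
As an illustrative aside, the summary lines that precede each alignment in this listing can be checked mechanically. The following is a minimal Python sketch and is not part of the search software that produced the listing; the names `parse_hsp_summary` and `truncated_percent` are hypothetical. It assumes that the reported percentages are truncated rather than rounded, which is consistent with values reported in this listing such as 148/276 (53%) and 245/425 (57%).

```python
import re

# Matches fields such as "Identities = 280/490 (57%)" in an HSP summary line.
HSP_FIELD = re.compile(r"(Identities|Positives)\s*=\s*(\d+)/(\d+)\s*\((\d+)%\)")


def parse_hsp_summary(line):
    """Return {field: (matches, aligned_length, reported_percent)}."""
    return {
        name: (int(matches), int(length), int(percent))
        for name, matches, length, percent in HSP_FIELD.findall(line)
    }


def truncated_percent(matches, length):
    """Percent identity with the fractional part dropped (assumed convention)."""
    return 100 * matches // length


if __name__ == "__main__":
    line = ("Identities = 280/490 (57%), Positives = 280/490 (57%), "
            "Strand = Plus / Plus")
    for name, (m, n, p) in parse_hsp_summary(line).items():
        # Report whether the recomputed value agrees with the printed one.
        print(name, m, n, p, truncated_percent(m, n) == p)
```

For a nucleotide comparison the Identities and Positives fields carry the same counts, so the two parsed tuples are expected to be equal.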






>gb|AF015472.1|AF015472 Plasmodium falciparum microsatellite ps90 sequence
Length = 375
Minus Strand HSPs:
Score = 246 (43.0 bits), Expect = 0.0072, P = 0.0072

Identities = 164/276 (59%), Positives = 164/276 (59%), Strand = Minus / Plus
Q: 525 GAAACAAAAGG2UACAG-TATGAAGAGTTCCTTAATCATTTTTGAAACAA;UATC 469 

nil I II III I III II I I Ml I II mill II III I II 

S: 45 GAAATATAATAAAATGGATATAAAAATATTTTTA-TGTTTGTTGAAAGAAGAATAATATA 103 

Q: 468 -AGGGATTTTAAACATATGATTATTTTTA-ATTT-TAT-GCCTTTTCAGTA-CTAAACAC 414 

i III I I I II III II I II II III I I II II- I II I 
S: 104 TATaTATATQATGAAQAT-ATTTTTATAAGAQTTATATAQAATATTAAGAAACTTTATTA 162 

Q: 413 CCATTTCATTGCTGATTCC-TGTCTAAGAAGCCATTCAC-GTCAGCATGGCGATAGAAAG 356 

III II I INI I II I II II I II II INI II 

S: 163 ATATTGTATATAT-ATTCAATATCATATTAGAAATAAATTGTTTATATAATGATAAGAA- 220 

Q: 355 AATGAAAAA-ACCCTGCTGAA-TCATAC-AGTAATTTTCT-TTAAAGCACATAGTAGTTA 300 

II III II II I I III III I lllll I III I I III II II 
S: 221 AA-GAAGAATACTT^ACAAAAATGTTACCAAerATTTTGTATTATAT- 277 

Q: 299 CATAAATATATATATATAAATATATTTTTGTTTATA 264 

III lllllllllllll Mini I I I MM 

S: 278 TATATATATATATATATATATATATATATTTATATA 313 (SEQ ID NO: 21)



>emb|AL034558.2|PFMAL3P2 Plasmodium falciparum MAL3P2, complete sequence
Length = 153,098
Plus Strand HSPs:
Score = 307 (52.1 bits), Expect = 0.31, Sum P(2) = 0.26

Identities = 245/425 (57%), Positives = 245/425 (57%), Strand = Plus / Plus



Q: 


121 


TTTTGCIAAGTGACAGAAAAATATTCTAATATTAAGGATTATTTTACAACTCTATG 


180 


S: 


147 


1 II lllll III II II III 1 II III II II II III 
TGTTAATAAQTTTTGGAATAAGATA-TAACTTAAAATTTTAACATATCATTTGATAAAAG 


205 


Q: 


181 


TAATGCAGTGATGCATCTTGCATCTGTTT-TGTCTTGATGACAAAACGCACTCT-TAGAG 

II 1 1 1 II M 1 II 1 III III 1 M 1 1 1 1 1 1 II 1 
TATTATA-T-ATATATATAT-ATTTATTTATGTATATATAATATAT-QAAAAATATAAAT 


238 


S: 


206 


261 


Q: 


239 


TCACAAG-ATCC-TGCCT-TGTGTTAGT-TATAAACa^AAAATATATTTATATATAT^^ 


294 


S: 


262 


III 1 II Ml 1 MM 1 1 MM 1 1 1 IIIMI IIIMininil 

ATACATCTATTTATQCGTCTGTGCAAATATATATATATATATATATATATATATATATAT 


321 


Q: 


295 


TTATQTA-AC"TACTATGTQCTTTAAAQAAA-ATT"ACTG-TATGATTCAQCAGaGTTTT 


349 


S: 


322 


III II 1 II III 1 1 MM 1 1 III M III Ml 1 1 M 

ATATATATATATAATATTTA-TATAAATATATATTTACACATATATTTCTTTAATGGGTT 


380 


Q: 


350 


TTCATTCTTTCTATCQCC- ATGCTGACGTGAATGGCT- - TCT-TAGA- - CAGGAATCAGC 


403 


S: 


381 


III 1 III II 1 1 1 II 1 1 1 II 1 1 III II 
AACATATATAATATTTAATATA-TAATATATATATTTAATATATATATCCTGGACGAAGG 


439 


Q: 


404 


AA-TGAAATGGGTGTTTA-GTACTQAAAAGGCATAAAATTAAAAATAATCATATGTTTAA 


461 


S: 


440 


M 1 MM 1 MM II III 1 1 lllll MMIIII 1 II 1 III 

AAATTAAATAA- TATTTATGTGGAGAATAAAAA- AAAATAAAAAATAAAAAAAT- TCTAA 


496 


Q: 


462 


AATCC - CTTAACATTTTTGTTTCAAAAATGATTAAGGAACTCTTCATACTGTTTCCTTTT 

1 II 1 111 III III II Ml II 1 1 lllll 1 1 III nil 

ATTTCACCTTAGGTTTATGTG-CAGTA-TGCTTTATATATTCTTCTTTTTTTTTTTTTTT 


520 


S: 


497 


554 






Q: 521 GTTTC 525 



S: 555 -TTTC 558 (SEQ ID NO: 22)

Score = 273 (47.0 bits), Expect = 0.16, P = 0.15

Identities = 281/495 (56%), Positives = 281/495 (56%), Strand = Plus / Plus





Q: 


55 


TGACATAAACGAAAAAGTGGATCAAAT-AGTCAAGAACATGATGGGCQCGGCAATGAACT 
1 1 1 t 1 1 1 1 1 1 1 1 1 1 1 1 III 1 1 t 1 1 1 1 11 1 1 1 1 1 1 

1 1 IIMII III 1 III 1 1 II 1 II III! 1 1 III II 1 

TAAAATAAACCAAATTATTGATGATTTTATTTTAAAA-ATGA- - -GAGAAA-AATAAAAT 


113 




S: 


117106 


117160 




Q: 


114 


GAACaVCTTTTGCTAAGTG-ACAGAAAAATAT-TCTA-ATATTA-AGG-AT-TATTT"TA 

II 1 1 1 1 1 II III 111 iiii-i 1 iiiii 1 II III 1 II 

III MM II IIMIIMII M IIIII 1 IMII Ml 

ATACyVAAATATAO^TATQTACA-AAATATATCTGTTTATATTQTACyia^TATATATAT^ 


166 




S: 


117161 


117219 


Q: 


167 


CTUlCTCTATGGAAGTA-ATGCAGTG-ATGCATCTT-GCyVTCTGTT-TTGTCTTGATGACA 

1 iiii 1 till II II III11II 1 III 1111 11 

1 MM 1 Mil 11 II 1 II 11 II 1 1 1 II 1 II 11 


222 




S: 


117220 


TATATATATATATATATATATATTATATATATTTTAGCAAATCATATTAT-TTCA--ATA 


117276 




Q: 


223 


AAACGCACTCTTAGAGTOlCAAGATCCTGCCTTGTGTTAGTTATAAAaUUUATATATOT 
II IIIII 1 It 1 iiiiiiiiiiii IIII 

1 1 1 IIII 1 II 1 II MM IIII II MM 

TOATQTCTTTTTAGTTTTTTTAAAAATTATATTTTGTT-QTTTTTA- CTGCQ- TAAAAT - 


282 




S: 


117277 


117332 




Q: 


283 


ATATATATATATTTATGTAACTACTATG-TGCTTTAAAGAAAATTACTGTATGATTCAGC 
IIIIIIIIIIII iiiiit 1 1 III IIIII 1 III 1 1 111 1 t 1 


341 


S: 


117333 


IIIIIIIIIIII llllll 1 1 III IIIII MM 1 IMII II 

ATATATATATATATATGTATTTQC- ATGATGCTT - - ATGAATTTCTTTTTATAAGTAAAG 


117389 






342


.AGGGraTTTTCArrCITrCTATCG-CCATGCTGACGTGAATOT 
* IIIII ■ IIII II IIIII 1 llllll II 1 1 II 1 II 


398




S: 


117390 


1 IIIII 1 IIII II IIIII 1 llllll III III III 
AACACTTTTTO-TACTTTAAATATTCCy^TGAAAA-QTGAATTT-TTATCCXSTACTACGA- 


117445 




Q: 


399 


TOIGOV-ATGAAATGGGTGTTTAGTACTGAAAAGGCATAAAATTAAAAATA-ATCATATG 
lit 1 III 1 1 1 IIII 111 II 1 III 1 II 1 III II iiii 


456 




S: 


117446 


III MM II IIIIIIM III MM II MM II IIII 

TCATTATATGTA-TCACTATTTACTACATAATATATATATATATATATATATAT-ATATA 


117503 


Q: 


457 


TTTAAAATCCCTTAACATTTTTGTTT-CA-AAAATGATTAAGGAACTCTTCATACTGTTT 


514




S: 


117504 


Mill 1 IIII III III IIII M IIII II IIII III 

TATATA-TATATATATATATATGTATACATAAAAGGAA-AAGGTAATQAAAATATT-TTT 


117560 




Q: 


515 


CCTT-TTGTTTCCAT 528 






S: 


117561 


1 1 M III II 

CTTAATTATTTAAAT 117575 (SEQ ID NO: 23) 






Score = 266 (46.0 bits), Expect = 0.33, P = 0.28

Identities = 150/253 (59%), Positives = 150/253 (59%), Strand = Plus / Plus




Q: 


265 


ATAAACAAAAATATATTTATATATATATATTTATGTAACTACTATGTGCTTTAAAGAAAA 

Mill III 1 MM IIIIIMIIMII III II 11 III 1 llllll 1 1 

ATAAATAAATAAATATATATATATATATATATATATATATA-TATATA-TTTAAAATATA 


324 




S: 


120887 


120944 


Q: 


325 


TTACTGTATGATTCAGCAGGGTTTTTTCATTCTTTCTATCGCCATQCTGAOT^ 

II 1 III 1 1 1 MM III IIII II III 1 III II 1 

TTTTTTTATTTTATTTTATTTTATTTTTATTTTTTTTTTIGGTGTGCXjTATC 


384 




S: 


120945 


121004 




Q: 


385 


TTCTTAGACAGGAATC-AGCAATGAAATGGGTGTTTAGTACTGAAA-AGGCATAAAATTA 

M M IMI M 1 III IIIIIM III M II 1 II M 1 

TT-TT-GACATQTACQTATGTAT-ATTTGGGTQTATA-T-CTAATAaiTACCTTTAAGCA 


442 




S: 


121005 


121059 




Q: 


443 


AAAATAATCATATGTTTA7VA-ATCCCTTAACATTTTTGTTTCAAAAATGATO 


501 


S: 


121060 


Mil 1 111 III Ml 1 MM M i M II III II 

AAAACACACACACATATATATATATATATATATATATATGTGATATATT- TTATTTCATT 


121118 






Q: 502 CTTCATACTGTTT 514 

II I I II II 
S: 121119 TTTAAGA-TGATT 121130 (SEQ ID NO: 24)


Score = 266 (46.0 bits), Expect = 0.33, P = 0.28

Identities = 230/409 (56%), Positives = 230/409 (56%), Strand = Plus / Plus





131 


TGACAGAAAAATATTCTAATATTAAGGATTATTTTACAACTCTATGGAAGTAATGCAGTG 

1 III III Mil III 1 II Mill 1 1 nil 1 II 1 1 1 


190 


S: 


124694 


TTACATAAATATATATTTATGTGT6CTAT-ATTTTGTTAATATATGTATATATTCAAATT 


124752 


Q: 


191 


ATGCA-TCTTGCATCTGTTTTGTCTTGATGACA-AAACGCACTCTTAGAGTCACAAGATC 


248 


S: 


124753 


II 1 Mil 1 MM 1 II 1 1 III II 1 1 1 M 1 Mil 

ATTAAATCTTAAAAC-GTTGAA-CATAAAAAAATAAATCCAATTTQAAAGQAAAAAGAAQ 


124810 


Q: 


249 


CTGCCTTGTGTTAGTTATAAACAAAAATAT-ATTTATATATATATATTTATGTAACTAC- 


306 


S: 


124811 


1 III III 1 Mil II IIIIIIIIIIIII III II II 

AAAAAAAAAAAAAAAAATACACATATATATTATATATATATATATATATATATATATATA 


124870 


Q: 



307 


TATGTGCTTTAAAGAAAATTACTGTATGATTCAGCAGGGT - TTTTTCATTCTTTCTATCG 


365 


S: 


124871 


III 1 1 II 1 III i 1 II M 1 1 Ml lllll M III 1 II 

TATATA-- TATATTrTATTTTATTTT - TGTTTGAATAAGGTATTTTTTTTTTTTTTTTTCT^ 


124928 


Q: 


366 


CCATGC-TGAC-GTGAATGGCTTCTTAGACAGGAATCAGCAATGAAATGGGTGm 


423 


S: 


124929 


Ml 1 1 III 1 1 1 1 1 1 1 11 Ml MM 1 III III 

A- ATGTATAATAGTGCGTAGAATATAAAAAATATATGTA- AATCAAATA" - T- TTT- GTA 


124982 


Q: 


424 


CTGAAAAGGCATAAAAT- -TA-AAAATAAT- CATATGTTTAAA- ATCCCTTAACATTTTT 


478 


S: 


124983


II 1 II III 1 1 II IIIMI 1 lllll 1 II 1 II 1 1 II II 
-TGCATAG-GATTTATTAATATAAAATATTACATATATATATATATATATATATATATAT 


125040 


Q: 


479 


GTTTCAAAAATQATTAAGGA- ACT - CTTC A- TACTGTTTCCTTTTGTTT 524 
Mil 1 M 1 1 M 1 1 1 1 1 1 t 1 1 1 1 1 M 1 1 1 




S: 125041 ATTTGAGGAATAAAATTGGATATTACAAGAATAGTTTTTATTTTTTTTT 125089 (SEQ ID NO: 25)


Score = 246 (43.0 bits), Expect = 2.7, P = 0.93

Identities = 242/432 (56%), Positives = 242/432 (56%), Strand = Plus / Plus


Q: 


133 


AOVGAAAAATATTCTAATATTA-AGGATTATTTTACyUVCrCTATGGAAGTAATGCAGTC 

Mil liiii 1 i li 1 1 li III mil III III III 1 1 

ATATATATATATTTTTAftATAACAAAATQATTA-ACAACG-TATC- -AGT- -TQCCATAA 


191 


S: 


126899 


126952 


Q: 


192 


TGCATCTTGCATCTaTTTTaTCTTQATaACAAAACGCACT- CT- TAOAGTCACAAOATCC 


249 


S: 


126953 


1 1 III MM III 1 1 11. i 1 li li i nil 1 

CCCTTGTTGGTT-TATTAGCACTTTTTAA-AAGTGGTTATACrGTATACTCACGTAAGTT 


127010 


Q: 


250 


TGCCTTGTGTTAGTTATAAACyUAAATATATTTATATATATATATTTATGTAACTACTAT 

1 II II mill 1 iiiiiiii iiiiiiiimii III II 1 III 

TAAAAAGTAATA--TATAAAXATAAATATATATATATATATATATATATATATATT-TAT 


309 


S: 


127011 


127067 


Q: 


310 


GTGCTTTAAAQAAAATTACTGTATQATTCAGCAGGGTTTT-TTCATTCTTTCTATCQCCA 

1 Nil nil 1 III III 1 II II iiimi II n i 

TTA-TTTATTTT^AATTTTTTTAT-ATTAA--AGATTTGAGTTCATTCyUUVCT-Ta^G 


368 


S: 


127068 


127122 


Q: 


369 


TGCTGACGTGAATGGCTTCT-TAGACAGGA-ATCAGCAATGAAATGGGTGTTTAGTACTG 


426 


S: 


127123 


II 1 II 1 II MM II II 1 1 II 1 M 1 II II 

TGATAACATTGATAACTTTTCTATACCGCACATG-GTAA-GCT-TAAGCAAAAAAAAAAA 


127179 


Q: 


427 


AAAAQQCATAAAATTAAAAATAATCATATGTTTAAAATCCCTTAACATTTTTaT-TTCAA 

nil Ml 1. II 1 II Mill 1 II II IMIII 1 II 

AAAAAAAATATATATAGTACAAAQ- AGATATTGAGA- TAGATTTTTATTTXTATATTTGT 


485 


S: 


127180 


127237 




Q: 486 AAATQATTAAGQAACTCTTCATACTGTTTCCTTTTGTT-TCCATCAQCCCAOAaCA-A-- 541

Mil mil I I II III! II Mil I I II II III II 

S: 127238 AAAT-ATrAAa^TATGTTTATATTTTATAATTrTTrAGTTAATT-GCGCAA^ 127295 

Q: 542 CTGTGGTTTCAT 553

INI III M 
S: 127296 CTGTACTTTAAT 127307 (SEQ ID NO: 26) 

Score = 239 (41.9 bits), Expect = 5.6, P = 1.00
Identities = 197/346 (56%), Positives = 197/346 (56%), Strand = Plus / Plus

Q: 127 TAAGTGACSVGAAAAATATTCTAATAT-TAAGOAT-TAT-TTTACAACTCTATGGA-AGTA 182

II II I I III I II Mill II II III nil I III! I I II 

S: 107911 TATaTAA-ATAAATAAATAAAAATATATATATATATATATTTATATTTATATTTACATTA 107969 






Q: 183 ATGCAGTGATGCSlTCTrGCATCT-GTTTTGTCTTQATGACaAAACSCACTCTTAGAGTCA 241 

II i II II I II I mil I III I I I I III I II III I 

S: 107970 ATATTTTTATTTATATATTATATCGTTTTAT-TTGTTAATATATTGCATACATA-AGT-A 108026 



Q: 242 CS^QATCCTGCCTTGTGTTAGTTATAAACAAAAAT-ATATTTATATATATATATTTATGT 300

I II I III III I III im I II iiiiiiiiiiiiiii 1 I I 

S: 108027 TATAATATTATATTTTTTTAAACA-AAATAAAATTTATGTTTATATATATATATATCTTT 108085 
Q: 301 -AACTACTATGTGCTTTA-AAGAAAATTACTGTATGATTCAGCAGGGTTTTTTCAT-TCT 357 

25 II II I I II I II nil II II II II II I I I III I 

S: 108086 TAATTAATGT-TGAAAAATAAAAAAAATAAA-TAA-ATACA-CATATATATAT-ATATAT 108140 
Q: 358 TTCTATCGCCATGCTGACGTGA-ATGGCTTCTTAGACAGGAATCAGCAATGAAATGGGTG 416

III I I I I I I III ill I 

S: 108141 ATATATATATATA-T-ATGTAATATC-CTTTTCACATATTATCCTTCAAGTTAATCTTTC 108197
Q: 417 TTTAGTACTGAA-AAGGCATAAAATTAAAAATAATCATAT6TTTAA 461 

II I nil II II II II III II I II I II 

S: 108198 ATTT-TCTTGAATAAAACACCAACTTGGAAACAAATAAATAATAAA 108242 (SEQ ID NO: 27)

Score = 237 (41.6 bits), Expect = 6.9, P = 1.00

Identities = 147/246 (59%), Positives = 147/246 (59%), Strand = Plus / Plus
Q: 265 ATAAACAAAAATATATTTATATATATATATTTATGTAACTACTATGTGCTTTAAAGAAA- 323

I III iiiiiliiii miinm ii iii ii ii iii i i ii i i i 

S: 144839 AAAAAAAAAAATATATATATATATATAAATATATATATATA-TATATA-TATATATATAT 144896 
Q: 324 AT-TACTGTATGATTCAGCAGGGTTTTTTC-ATTGTTTCTATCGCCATGCTGACGTGAAT 381 

45 II II I III II I I I mm II II nil in i i i ii 

S: 144897 ATATA-TATAT-ATATAT-ATGTTTTTTTGTATGATTACTATTACAATAGTT-CTTAAAA 144952 
Q: 382 GGCTTCTTAGACAGGAATCAGCAATQAAATGGGTGTTTAGTACTGAAAAGGCATA-AAAT 440 

M Mil III I III II M II II I II II I nil 

50 S: 144953 AGGTAAACATACACTAATGATTTATCAT-TGTATATATAAAAGTTATAAA-CAAATAAAT 145010 
Q: 441 TAA-AAATAATCATATGTTTAAA-ATCCCTTAACATTTTTGTTTCAAAAATGAT-TAAGG 497

II MMil I II llll II I llll II I I III II Ml 

S: 145011 AAATAAATAAATAAATATATATATATATATATATATATATATAT-ATATAT-ATATAATA 145068 



Q: 498 AACTCT 503 
II II • 

S: 145069 TACACT 145074 (SEQ ID NO: 28) 



Score = 234 (41.2 bits), Expect = 9.5, P = 1.00

Identities = 210/363 (57%), Positives = 210/363 (57%), Strand = Plus / Plus






Q: 116 ACCACTTTTGCTAAGTGACAGAA-AAATATTCrAATATTAAGGATTATXT-TACAACTCT 173 

Mill II II I II I llllll mil I Mill IMM I I 

S: 132573 ACCACATTATATATGATATATTATAAATATA-TAATAAAATTTATTATATATATAAATAT 132731 
Q: 174 -ATGGAAGTAATGCAGTGATGCATCTTGCATCTGTTTTGTCTTGATGACAAAACGCACTC 232

II I II II II MM II II M I M I I II II Mill 

S: 132732 TATAAAACT--TGAAG-GATGAAT--TGAAT-TATTAT-TATTTATCTTTTCACGCATAA 132784 
Q: 233 T--TAGAQTC^CAAGATCCrQCCTTQTGTTAGTTATAAACAAAAATATATTTATATATAT 290 

10 I II I I II I I Mil I Mil I I I llllll lllllllll 

S: 132785 TAATATAAAAAAAAAAAAATAAACCATGTT-GATATATATATATATATATATATATATAT 132843
Q: 291 ATA-TT-TATGTA--ACrACTATGTGCTTTAAAGAAAATTACT-GTATG-ATTCAGCAGG 344 

III II llllll III III I II I I II M MM III I II 

S: 132844 ATAATTATATGTATGAATAAAATATA-TATATATATTAT-ACAAGTATTTATTTAATATG 132901
Q: 345 GTTTTTTCATTCT-TTCTATCGCCATGCTGACGTGAATGGCTTCTTAGACAGGAATCAGC 403 

llllll I I II Ml III I II I II I I I I I II I 

S: 132902 CTTTTTTTTTATTATTAAATC-CCAA--T-ACIATAAAAATATAAT-ATATATATATTAT- 132955






Q: 404 AATGAAATGGGTGTTTAGTACTGAAAAGGCATAAAAT-TAAAA-ATA-ATCaTATGTTTA 460 

III I II 1 I II II I I llll ll ll l -111- M llllll I 

S: 132956 AAT-ATATATATATATAATATATATATATAATAATATATATATTATATAT-ATATGTAGA 133013 



Q: 461 AAA 463
III 

S: 133014 AAA 133016 (SEQ ID NO: 29)

Score = 137 (26.6 bits), Expect = 6.4, Sum P(2) = 1.00
Identities = 119/196 (60%), Positives = 119/196 (60%), Strand = Plus / Plus






Q: 


121 


TTTTGCTAA-GTG-ACAGAAAAATATTCTA-ATAT-TAAGGATTATT-T-T-ACAAC-TC 


172 


S: 


83805 


MM Ml II 1 1 1 MM III 1 II II III 111 1 1 1 III II 

TTTTAGTAATGTTCATATATTTATATACTATAAATATATQGACCATTQTGTOATAACATC 


83864 


Q: 


173 


TATGGAAQTAATGCAQTQATQCATCTTGCATCTGTTTTGTCTT- -GA- -TGAC-AAAACG 

III 11 Ml 1 11 11 II Ml Mil 1 II 11 1 1 MM 

- ATGCAA- TATTTAAAC- ATA- ATTTTTTTTCT- TTTTCTTTTAAGAACTTTCTAAAAAA 


227 


S: 


83865 


83919 


Q: 


228 


CACTCTTAGAGTCACAAGATCCTGCCTTGTGTTAGTTATAAACAAAAATATATTTATATA 

1 111 1 1 1 III 1 1 1 III 111 III 1 llllllllllllll 


287 


S: 


83920 


AAAAATTACA- TAAAAAGTTTTTATCCA- T - TT - - TTA- AAATATTAATATATTTATATA 


83973 


Q: 


288 


TA-TATATTTATQTAA 302 

ir II II 1 Ml! 

TAATACATACAAGTAA 83989 (SEQ ID NO: 30) 




S: 


83974 




Score = 127 (25.1 bits), Expect = 0.31, Sum P(2) = 0.26

Identities = 131/242 (54%), Positives = 131/242 (54%), Strand = Plus / Plus


Q: 


1034 


TTTATTATGATTTTTTTTTTCTGATCCCTTTGCAACCCTGC-ACCTAAGCCAAAAGCATT 

III II 1 IIIIIIMIllI 1 1 III 1 1 1 11 1 111 1 1 

TTTTTTTTTTTTTTTTTTTTCTTTTTCTTTTTTTATGAGACTAATTATGTATAAAC^ 


1092 


S: 


72397 


72456 


Q: 


1093 


ATAATCTTGTCATACTTCAGA-TAAGTCCACGGGAGATGTTCCGAGTGAACTATAG-ATG 


1150 


S: 


72457 


II MM II llllll 1 1 1 1 1 III 11: llllll II 

TTA- - CTTGA- ATTGTTCTIGAATTTTTTAAAGATATTTTTTATQTTTTCACTATATTATT 


72513 


Q: 


1151 


ACATTCCACTAG6GAATTCTATGTTCAGTGTAAATGGTATCTTGTATAAGTTTTAGTTTT 

III 1 II 1 1 II 1 M III 1 II 1 III II II III 
TCATAGCTGTAATATAATATAATATAA-TATAATCAAT-TCATTTATTATTAATATTTTQ 


1210 


S: 


72514 


72571 




Q: 1211 TTGTTTACCCTTTGT-"TTCCTGGGCTGAQC-T"TGTCaVQAAATCTTGTCrrCTTCAGG 1266 

Mil Hill II III II MM mil III II II I 

S: 72572 ATATATOlCATTTGTGATTTQTGTG-TGCACGTATTTACAQAACyUlATGTATT-TO 72629 

Q: 1267 CT 1268
II 

S: 72630 CT 72631 (SEQ ID NO: 31)

Minus Strand HSPs:
Score = 266 (46.0 bits), Expect = 0.0079, Sum P(2) = 0.0079

Identities = 334/610 (54%), Positives = 334/610 (54%), Strand = Minus / Plus

Q: 841 TGTGTGGCyulAAGGATGTTGTTTTOCTGGTCTAGATC-CATCTGTACCAACaAGTTC^ 783 

II III I II Mil I II III III II II II II Mini Mill 

15 S: 100142 TGAGTGTCTAATGCATGAT-TATCTCTTTTCTTTATrrTCAGCTATATCAACAA — TCATC 100198 
Q: 782 ACTTTAC-AGAACGAATCTTTTTATCCGTACAGGAGGT-TaUACCATGTCTGCCTCTTC 725 

III IMlll I MM III I M I II I I 11 I 

S: 100199 GTTCTAATAGTAG?VAAGATATATATTATTCAAAGCCCTCTGATAAAATATTTTTTTCG-C 100257 






Q: 724 CTTTG-TAATGAATGAC-CTTTCTATGAGCTGTGACAAAATTTCCGAACAATTA-G-CTA 669 

II I III III II. M II I II INI I II III I I III 

S: 100258 TTTAACTTTTGA-TGAAACGTAATTTAAGTTTTTATQAAATATTAAAA-AATGAAGACTA 100315 



Q: 668 AGGATTTGGGAAGAGGGGGTGGCATIACGGGGCTTTCTGTTTTCCTGCCTCT^GCATG 609

nil II I II I Ml I II I I I II I I 

S: 100316 GTAATTTTC-ATTAACCTTTATATAATACG--TATCA-TCCTCATATAT-ATAATATATA 100370 



Q: 608 CATCTGATTTATGCTTTATGGA-AGCCTTACCTCCAATCCCCAACTGTTAAQTCCCATGA 550 

30 )| I II III I I III III I II III Ml II 

S: 100371 TATAT-ATATATATTGAAACGAGATAATTAT-TAATATTATAAAATTTGAAAATT--T-A 100425 
Q: 549 AACCACAGTTGCTCTGGGCTQATGGAAACAAAAGGAAACAGTATGAAGAGTTCCTTAATC 490 

III MM II II I Ml MM III III IMlll III 

S: 100426 AACGACAATATCACAAGGTAAAAAAAAAAAAAAATAAA-A-TAA-AATAATTTCAAA^ 100482
Q: 489 ATTTTTGAAACAAAAATGTTAAGGGATTTTAAACAT-ATG-AT-TATTTTTAA-TTTTA- 435 

\ M MM I II I I Mill llllllllllllllllllll 

S: 100483 AAATATTAAA-ATATATATATATATATTATATATATTATGTATGTATATTAAAATTTTAC 100541 






Q: 434 TGCC-TTTTCAGTACTA71ACACCCATT-TCAT--TGCTQA-TTCCTGTCTA-AGAAGCCA 381 

I MM Mil II MM I I IMlll IMlll I 

S: 100542 TTTTATTTTTATTT*TCTCCTGGGAATATCyVTAATAATAAGTTCCT-T-TACACAAAAAA 100598 



Q: 380 TTCACGTCAGCaiTG-GCGATAGAAAGAATGAAAAAACCCTGCTGAATCATACAGTAATTT 322

II I II II M III II lIMM IIIMIII II II 

S: 100599 ATAATTAAAATGTGTGCAAAAAAAAAAAAAAAAAAAAAAAAAAATATCATACA-TA-ra 100656 
Q: 321 TCTTTAAAGCACA-TAGTAQTTACATAAATATATATATATAAATATATTTTTQTTTAT-A 264 

50 I III II Mill II III IIMIIIIIMII IMlll II I III I 

S: 100657 TATATAT-GTATAATA-TATATATATATATATATATATATATATATATAATTATATATCA 100714 
Q: 263 ACTAACACAA 254 

II II iiir 

S: 100715 ACAAA-ACAA 100723 (SEQ ID NO: 32)
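
In the alignments listed under "Minus Strand HSPs" the query coordinates descend (for example, from 263 to 254 in the final segment above), because the displayed query residues are the reverse complement of the plus-strand query. The following minimal Python sketch illustrates that relationship; the helper names `reverse_complement` and `hsp_span` are hypothetical and the code is offered only as an illustration, not as part of the search software.

```python
# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of an ungapped DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def hsp_span(start, end):
    """Number of residues covered by a coordinate pair, in either direction."""
    return abs(start - end) + 1


if __name__ == "__main__":
    minus_segment = "ACTAACACAA"  # query residues shown for positions 263 to 254
    print(reverse_complement(minus_segment))
    print(hsp_span(263, 254))
```

The reverse complement of the displayed segment, TTGTGTTAGT, agrees with query positions 254 to 263 as they appear in the plus-strand alignments of this listing.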

Score = 242 (42.4 bits), Expect = 4.1, P = 0.98

Identities = 242/430 (56%), Positives = 242/430 (56%), Strand = Minus / Plus

Q: 538 CTCTGGGCTQATGGAAACAAAAGGAAACAGTATGAAGAGTTCCTTAATCATTTT^ 480

Nil I II III III I Ml II II I III III II I II 

S: 110619 CTCTTTTATTATTTAAAGAAACATACAOU^CATTCATAATATAACAATTATTATO 110678 








Q: 


479 


CAA- A-AATGTTAAGGGATTT- TAAACATATGATTATTTTTAATT TTATGCCTTTTC 

1 i nil III 11 1 III nil iiiiiiii 1 III mill 

TTATATAATOATAAAT-ATATATAATAATATTTAAATTTTTAAATAAGTTAATCCTTTTA 


426 




S: 


110679 


110737 




Q: 


425 


- AGTACTAAACACCCATTT" - CATTGCTGATTCCTGTCTAAGAAGCCATTCACGTCAGCA 


369 




S: 


110738 


1 1 III 1 Mil III 1 II II II II 1 11 i 1 

GA7VAAATAATAAAAAATTTGGCATCAATTTTTAAAAAGGAAAAATT-ATAAATGTA 


110796 




Q: 


368 


TGGCGATAGAAA-GAATGAAAAAACCCTGCTGAAXCATACAGTAATTTTCTTTAA-AGCA 


311 


S: 


110797 


1 1 II III 1 1 1 1 II III III rill 1 Mill 

TTAAG-TATAAATGTCTAATACAAATAAAAAAAAT-ATATA-TAAATAAATAAAAGAAAA 


110853 




Q: 


310 


CATAGTAGT- TACATAAATATATATATATAAATATATTTTTGTT- TAT- -AACTAACACA 

III 1 1 lllllllllllllllll 1 III II III 1 III III II 1 1 

AATACAAATATACATAAATATATATATTTTAATCTACATTTATAATATGAAACAAATAAA 


255 




S: 


110854 


110913 




Q: 


254 


AGGCAGGATCTTGTGACTCTAAGAGTGCGTTTTGTCAT-CAAGAO^AAAC^ 

1 Mil III 1 1 1 I Mil 1 1 III 1 nil III Ml 

AAAAAQGAAAAAAATT-TCTTA-ATTTCCTTTTTCCTTACAAAAAAAAAAA-ATAAAA-A 


196 




S: 


110914 


110969 


Q: 


195 


TGCATCACTQCATTACTTCCATAGA-GTTGTAAAATAATCCTTAATATTAGAATATTTTT 


137 




S: 


110970 


1 nil .11! 1 i III ill llllll i III II l lllll 1 

TAAATCAAA--ATAAAGAC-A-AGATGTTA-AAAATA-TA-TTA-TACCTTA-TATTTAT 


111020 




w • 


X J u 








S: 


111021 


M II ill 

CTATAATTTA 111030 (SEQ ID NO: 33) 





Score = 234 (41.2 bits), Expect = 7.4, Sum P(2) = 1.00
Identities = 146/243 (60%), Positives = 146/243 (60%), Strand = Minus / Plus

rrCGTTAAT- CATT - TTTGAAACAAAAATGTTAAGGGATTTTAAACATATGAT - TATTTT 443 

II Mill Ml III I II II I MM I I II I nil I III I 





Q: 


499 




S: 


34289 




Q: 


442 




S: 


34348 




Q: 


387 




S: 


34407 




Q: 


329 




S: 


34465 




Q: 


269 




S: 


34521 



I III II II II I III II II II I I I I III III 

CACAT-ATTAGTTTCAGCATATGAATATTAACAAATGTATATTTAATAAT-AATAATAA' 
AGTAATTTTCTTTAAAGCACATAGTAGTTACATAAATATATATATATAAATATATTTTTi 

I nil I MM I III II II Mi llllllllllll II llllll 



Score = 206 (37.0 bits), Expect = 0.0079, Sum P(2) = 0.0079

Identities = 128/211 (60%), Positives = 128/211 (60%), Strand = Minus / Plus

Q: 318 TTAAAGCACATAG-TAGTTACATA-AATATATATATATAAATATATTTTTGTTTATAACT 261

llllll M II llllll IIIIIIIIMIIII -llllll I I I MM I 

S: 109096 TTAAAGAQAATTTATACATACATATAATATATATATATATATATATATATATATATATAT 109155 
Q: 260 A-ACACAAGGCAG--G-ATCTTGTQACTCTAA-GAGTQCGTTTTGT-CATCAAGACAAA- 208 

^ I I I I III I 1 1 I II I II I II - I I I 1 1 1 I I -I 

S: 109156 ACSftAATATAGTAGTTGTATTTTGTTTAAATAATGAAAAAftATATCTGCATATATATATGT 109215 






Q: 207 ACAQATGCAAGA-TGCATC-ACTGCATTACTTCCATAGAGTTGTAAAATA-ATCCTTAAT 151 

III II ill nil II IN I nil i II II III II I 

S: 109216 ACATATnTA(3AATaCACATAATAAATQATATTCATATATTTAATGCTTATATCATT--T 109273 
Q: 150 ATTAGAATATTTTTC-TGTCavCTTAGCAAAA 121

II nil iiiiiii i II III nil 

S: 109274 AT-AGAA-ATTTTTCCTTTCTCTTTAAAAAA 109302 (SEQ ID NO: 35)

Score = 196 (35.5 bits), Expect = 0.021, Sum P(2) = 0.021
Identities = 110/182 (60%), Positives = 110/182 (60%), Strand = Minus / Plus






Q: 


317 


S: 


113645 


Q: 


261 


S: 


113705 


Q: 


203 


S: 


113764 


Q: 


143 


S: 


113820 



nil I ill I II III iniiiiiiiiii iiiiii I II I niii i 

TAAATATAAATATTTATAATATATATATATATATATATATATATATATTATATATAAAC< 
TAACACAAQGCAGGATCTTQTOACTC--TAAaAGTaCGTTTTOTCATCAAaACAAAACA( 

nil! I nil 1 11 III II II I I 11 11 I III I 



II III I I I II nil I I II I inn ii iiiii i i ii 



Score = 184 (33.7 bits), Expect = 0.068, Sum P(2) = 0.066

Identities = 136/230 (59%), Positives = 136/230 (59%), Strand = Minus / Plus
Q: 331 ACAGTAATTTTCTTTA-A-AGCACATAGTAGTTACATAAATATATATATATAAATATATT 274 

I III I IIIIIII I Mill II II Ml I IIIII III MM IIIIIII 

S: 116493 ATAGTTACAATCTTTATATAATACATATTATATATATATATATATATATATATATATATT 116552 
Q: 273 TTTOT-TTATAACTAAGAOEl-AGGCyVOaATCTTGTOACTCTAAC^^ 216

M Mi M llll I Mill II MM M M 

S: 116553 TATTTATTTTAGCTAATAAGGAGTTAAA7VATTAAAGAGG--AAGAAAAAATTAAATTAGA 116610
Q: 215 -AAGACAAAACAGATGCAAGATGCATCACTGCATTACTTCCATAGAGTTQTAAAA-TA^^ 158 

40 II I IIIII Milll III I II II Ml IIIII MM 

S: 116611 GAATATAAAA-ACATGa^--TG-AaAA--QAAA-ACaW3aAQAAGAAAA-TAAA^ 116662 

Q: 157 CCTTAATATTAG-AATATTTTTCTGTCACTTAQCAAAAGT-GGTTOVGTT 110 

I IIMI I llll II III IIIII lllllllll 
S: 116663 CAA-AATATAAATAATAATATAAA-TGATGAAAAAAAAOAAGGTTCAGTT 116710 (SEQ ID NO: 37)



Score = 181 (33.2 bits), Expect = 0.092, Sum P(2) = 0.088

Identities = 107/176 (60%), Positives = 107/176 (60%), Strand = Minus / Plus
Q: 309 ATAGTAQTTACATAAATATATATATATAAATATATTTTTGTTTATAACTAACACAAQQ<^ 250 

Ml II II III IIIIIMIIIIII MUM I I I MM M I II 

S: 108899 ATACTATGTATATATATATATATATATATATATATATATATATATATATAT-ATAATTAT 108957 

Q: 249 GGATCTTGTGACTCTAAGAGTGCGTTTTGTCATOWl-GACAAAAC-AGATG-CAAGATC 193

I I I II ill I I I III II I II II llll I II III II 
S: 108958 TTTTATAGA-ATTATAATA-TCC-TTTAGTG-TAAATGAAAAAATTATATTTCAATATAT 109013 

Q: 192 ATCACTGCAXTACTTCCATAGAGTTGTAAA-ATA-ATCCTTAATATTAGAATATTT 139 

60 II M II I I Ml I 11 M III M Mil l I IIMI I 

S: 109014 ATGTATGTAT-ATGTGTATATA-TT-TACTTATAGATTTTTAAGAAAAAAATATCT 109066 (SEQ ID NO: 38)

Score = 173 (32.0 bits), Expect = 0.20, Sum P(2) = 0.18

Identities = 109/187 (58%), Positives = 109/187 (58%), Strand = Minus / Plus
Q: 317 TAAAGCa^a^TAQ-TAGTTACSlTAAATATATATATATAAATATATTTTTaTTTATAACT 259 

Mil I III II II iiiiiiiiiiiiiiiii mill I I I INI II 

S: 142969 TAAATAA-ATAAATAAATAAATAAATATATATATATATATATATATATATATATATATAT 143027 
Q: 258 CACAAGGC-AGGATCTTGTGACTCT-AAGAGTGCGTTTTGTOVTCAAGACAAAACAGATO 201 

I I II I i II III M I I I 1 II HIM M 

S: 143028 TT-ATGTCCATCikAAATAAATCTTTTAATAATTAATAGTATACT-AATTTAAAACC-ATT 143084 
Q: 200 CAAGATGCATCACTQCAT-TAClTCCyiTAGAGTTGTAAA-ATAATC-CTTAATATTAG;^ 144 

III II II I II II I III I I II I iiiii I I nil 

S: 143085 CAATATATATATATATATATATATATATATATATATATATATAATTACGTTATTTTTTTT 143144 

Q: 143 TATTTTT 137
IIMIII 

S: 143145 TATTTTT 143151 (SEQ ID NO: 39) 
Score = 173 (32.0 bits), Expect = 0.20, Sum P(2) = 0.18

Identities = 133/229 (58%), Positives = 133/229 (58%), Strand = Minus / Plus



Q: 


324 


TTTTCTTTAA- AGCACATAGTAGTTACATAAA- TATATATATATAAATATATTTTTGTTT 


267


S: 


114139 


nil inn i ii in i i i ii i in iniini inn n i i i 

TTTTTTTTAATACCA-ATAAAAAT-ATATGTACTAT-TATATATATATATAATTATATAT 


114195 


Q: 


266 


ATA-ACTA-ACAOVAGGCAGGATCTTGTGACTCTAAGAGTGCGTTTT-GTCATCAAGACA 


210 


S: 


114196 


III Ml 1 III 1 M 1 1 1 II III 1 III 1 II Ml 1 

ATATAATATATACATAT-AT-ATAATATAAATCATAGAATTTTTTTCAGAAATATAGATA 


114253 


Q: 


209 


A-AACy^GATQCAAGATGCATCACTQCATTACTTCa^TAGAGTTGTAAAAT^^ 


151 


S: 


114254 


1 II 1 M 1 1 1 1 1 III III 1 IIIIMIM II 1 1 

ATAATAAATT- -ATTTTAAAAAGAAGACTACAAATATATAATTGTAAAATTATATATGTT 


114311 


Q: 


150 


ATTAGAATATTTTTCT-G-TCACTTAG-CAAAAGTGGTTCAGTTCATTG 105 




S: 


114312 


I f 1 ( 1 N M (Ml M Mill Mill Ml 

AT-ATATTTTTTTAAAAGGTCAAATAATCAASAGAATTTTATTTT-TTa 114358 (SEQ ID NO: 40)







Score = 170 (31.6 bits), Expect = 0.27, Sum P(2) = 0.23

Identities = 106/181 (58%), Positives = 106/181 (58%), Strand = Minus / Plus
Q: 315 AAGCACATAGTAGT-TACy^TAAATATATATATATAAATATATTTTTGTTTATAACTAACA 257 

II I III II 11 III IIMIIMIIIII IIIMI I I MM I I 

S: 114013 AATAATATATTATACTATATATATATATATATATATATATATATGCATATATATTTTAAG 114072
Q: 256 CAAGGCAGOATCTTGTGACTCTAAGAGTGCGTTTTGTCATCAAGACAAAACAGATGC^ 197 

I I M II III I III I I 11 II M I I II I I II M 

S: 114073 CCTCCCTTTATATTTTGA-TATAAAATTTA-TTATAT-ATTATTATAATATATATTAAA- 114128 
Q: 196 ATGCATCACTGCATTACTTCCATAGAGTTGTAAAATAATCCTTAATATTAGA-ATATTTT 138 

I I II I I II II Ml I IIIMI I II HIM- 1 MM I 

S: 114129 AGGTAT-ATTTTTTTTTTTA-ATACCAATA-AAAATA-TATGTACTATTATATATATATA 114184 

Q: 137 T 137 
I 

S: 114185 T 114185 (SEQ ID NO: 41) 

Score = 169 (31.4 bits), Expect = 0.29, Sum P(2) = 0.26

Identities = 43/53 (81%), Positives = 43/53 (81%), Strand = Minus / Plus

Q: 316 AAAGCACATAG-TAGTTACATAAATATATATATATAAATATATTTTTGTTTAT 265 

III IIIMI II II III IIIIIIIMIIII IIIMI III Mill 

S: 135768 AAAACACATATATATATATATATATATATATATATATATATATATTTATTTAT 135820 

Score = 168 (31.3 bits), Expect = 0.32, Sum P(2) = 0.28

Identities = 100/165 (60%), Positives = 100/165 (60%), Strand = Minus / Plus

Q: 301 TACATAAATATATATATATAAATATATTTTTGTTTATAACTAAavajlGG^ 242 

S II llllllll lllllll llllll I I I Mil II I II II II 

S: 130342 TAAATAAATATTAATATATATATATATATATATATATATATATGTATATACAT--TCATG 130399

Q: 241 TGACTCTAAGAGTGCGTTTTGTCy^TO^-AG-ACaAAACyVGATGCAAGATGC^^ 184 

I I III II III II I II I I I Mil I II IN III I I 

S: 130400 TTO-TATAAAAGCCCGTATTCrrATQACATTATAAAAAA-ATTAAATAA-CATAATO 130456

Q: 183 TT-ACTTCC-ATAQAGT-TGTAAAATAATCCTTAATATTAGAATA 142 

II III I II I I Mill II M MM I MM 

S: 130457 TTCACTACATATTTAACATATAAAAAAAAAATrTATAT-A-rAATA 130499 (SEQ ID NO: 42)

Score = 166 (31.0 bits), Expect = 0.39, Sum P(2) = 0.33

Identities = 72/110 (65%), Positives = 72/110 (65%), Strand = Minus / Plus

Q: 315 AAGCACATAGTAGTTACATAAATATATATATATA-AATATATTTTTGTTTATAACTAAC- 258

llllll II III I IIIIIIIIIIMIII lllllll I III MM II 
S: 109420 ATGAATATA-TATTTA-A-AAATATATATATATATAATATATATATQTAQ-TAACAAAAT 109475 

Q: 257 ACAAGGCAGGATCTTGTGACTCTAAGAGTGCGTTTTGTCATCAAGACAAA 208 

25 I Ml I I II I II II llllll Mill II II 

S: 109476 AAAAGATATT-TATTATAATTATATTCTTCCATTATATGATCAATACCAA 109524 (SEQ ID NO: 43)

Score = 161 (30.2 bits), Expect = 0.64, Sum P(2) = 0.47
Identities = 129/220 (58%), Positives = 129/220 (58%), Strand = Minus / Plus

Q: 317 TAAAGCACATAGTAGTT-ACATAAATATATATATATAAATATATTTT-TGTTTATAACTA 260 

MM I MM I II I III III I IIIMII illllll I M Ml I I 

S: 126378 TAAACC-CATATTTTTTCATATACATAAAAA.TATATATATATATTATATATATATGAAAA. 126436 






Q: 259 ACACAAGQOVGGATCTTOTGACTCTA-AQA-QTQCGTTTTGTCATCAAGACAAAACAGAT 202 

I I II I II I I 1 I I I I II i I I II II inn I I II 

S: 126437 AAATAATT-ATTATATATT-A-TAAACATATGTACATAT-GT-ATAAATACATTATATAT 126491 



Q: 201 GCA-AGATGCATCACTGOlTTA-CITCCaTAGAGTTGTAAAATA-ATCCTTAATATTAGA 145

MM III I II I III III llllll II INI II

S: 126492 ATATATATATATAA-T--ATGAGCTTGAAAAAAAAAATAAAATTTATTAATATTAAAAAA 126548

Q: 144 A-TATTTrTTC-TGrCACTTAGCAAAAGT^TTCAGTTC^ 107

I I lllllll I I M I I I II I Mill

S: 126549 AATGTTTTTCAtTTTATTCAT-ATTATAGATTTA-TTCAT 126586 (SEQ ID NO: 44)



Score = 161 (30.2 bits), Expect = 0.64, Sum P(2) = 0.47

Identities = 127/220 (57%), Positives = 127/220 (57%), Strand = Minus / Plus
Q: 317 TAAAGCACATAGTAGTTACATAAATATATATATATAAATATATTTTTGTTTATA-ACTAA 259 

MM Mill I II III IIIMIIIIIIM llllll I I Mill I I I 

S: 119745 TAAAT-ACATAAOl- -TATATATATATATATATATATATATATATATATTTATTTATTTA 119801 

Q: 258 CACAAGGCAGGATCTTGTGACTCTAAGAGTQCGTTTTGTCATCA-AGACAAAACA^ 200

III I II I Ml I II III II I III III I II 

S: 119802 TA-ATTTCCCCTTTTTTTCT-TCTTTTTTTT-GTCTTGA-ATTATAGAAA^ 119857 

Q: 199 AAGATGCAtCACTGCAT-TACTTCCATAGAGTTGTAAAATAATCCTTAATATT-AG^^ 144 

1 II I I II II I III I I.I Mill I II I II I I 
S: 119858 6TTCTT-ATAAATATATATATATATATATATATATTAAATATTATGTATTTTTTATTTTA 119916 



Q: 143 TATTT-TTCTGTCAC-TTAQ-CAAAAGTGGTTCAGTTCAT 107

mil II 1 Ml III II 1 II Mil 1 II

S: 119917 TATTTATTTTTTCAAATTAATCATATATGCTTCATTATAT 119955 (SEQ ID NO: 45)


Score = 160 (30.1 bits), Expect = 0.70, Sum P(2) = 0.51

Identities = 110/188 (58%), Positives = 110/188 (58%), Strand = Minus / Plus






Q: 324 TTTTCTTTAAAG-CACATAGTAGTTACAT--AAATATATATATATAAATATAra 268

Nil III 1 MM 1 III 1 lllllllllllllll llllll 1 III

S: 147783 TTTTGTTTGCCTTCCCATAT^AATATACTTTTAAATATATATATATATATATATGTATGTA 147842

Q: 267 TATAACTAACACAAGGCAGGATCTTGTGACTCTAAGAGTGCGTTTTGTCATCAAGACAAA 208

1 II II II III 1 II 1 111 1 111 II II 1 1

S: 147843 TOT^ITOTATOT-ACQTATTTATTTATTTAATAAAQGAATATQQTTTATCCCCACTATATT 147901

Q: 207 AC7VGATGC-AAGATG-CATCACTGCATTACTTCCATAGAGTTGTAAA--ATAATCCTTAA 152

11 III II II II 1 II III nil nil 1) II 1 III II

S: 147902 -CACATGTTAATATTTCAAAA-TG-ATT-CTTC--ATATATTTATCATTCACAATTTATAC 147956

Q: 151 TATTAGAA 144

1 III II

S: 147957 TTTTAAAA 147964 (SEQ ID NO:46)


Score = 157 (29.6 bits), Expect = 0.94, Sum P(2) = 0.61

Identities = 81/128 (63%), Positives = 81/128 (63%), Strand = Minus / Plus




Q: 315 AAGCACATAGTAGTTACATAAATATATATATATAAATATATTTTTGTTTATAAC-T-AAC 258

II inn II II III nnininni iininni i iiii i i ii

S: 113474 AATTACATA-TA--TATATATATATATATATATATATATATTTTTATATATATCCTTAAA 113530

Q: 257 ACAAQGCAGG-ATCTTGTG-ACTC-TAAGAGTGCGTTT-TGTCATCAAGA-CAAAACAGA 203

II 11 II III 1 1 III 1 III 11 1 11 inn 1 1

S: 113531 ATATATCACACATTTOBTTCATTTATAATATAAAAACCATGTTATAATQAACAAAAAATA 113590

Q: 202 TGCAAGAT 195

1 II II

S: 113591 TTAAAAAT 113598 (SEQ ID NO: 47)



Score = 157 (29.6 bits), Expect = 0.94, Sum P(2) = 0.61
Identities = 137/236 (58%), Positives = 137/236 (58%), Strand = Minus / Plus





Q: 309 ATAGTAGTTACATAAATATATATATATAAATATATTTTTGTTTATAACTAACACAAGGCA 250

i 1 IIII niii iiiiiiiiiiiii n 1 II 1 II 1 11 1 1 III

S: 117794 AAATTAGTCACATATATATATATATATATATG-ACATTGGGGG-TA-CGAAAAATATGCA 117850

Q: 249 GGATCTTGTGACTCTAAGAG-TGCGTTTTGT-CATCA-AGACAA-AACAGATGCA-AGAT 195

III 1 1 IIII 1 II n 1 III 1 1 III nil II IIII

S: 117851 X-AT--TAAAAATATAAAACATATGTATTATACATAAGATACGACAACAAATATATATAT 117907

Q: 194 QCATCACTGCAT-TACTTCCATAGAGTTGTAAA-ATAA-TCCTTAATATTAGA-ATATTT 139

IIII III! 1 IIII 1 11 Mil 1 1 .11 1.1 11 1 nil 1

S: 117908 ATAT-A-TATATATATATATATATATATATATATATATGTACATATTACTACATATATGT 117965

Q: 138 TTCTGTCACTTAGCAAAAGTGGTT-C-AGTTCATTGCCGC-GCCCATCATGTTCTT 86

II 1 1 1 II 1 1 1 II 1 1 nil 1 1 III 1 1 11 III

S: 117966 TTATAT-ATTTTT-ATTATTTCTTACTAAATCATCACATCCGCCGACAACGT^ 118019 (SEQ ID NO: 48)



Score = 156 (29.5 bits), Expect = 1.0, Sum P(2) = 0.65
Identities = 122/201 (60%), Positives = 122/201 (60%), Strand = Minus / Plus

Q: 318 TTAAAGCACATAG-TAGTTACATAAATATATATATATA-A-ATATATTTT--TGTT-TA-T 265 

I III I 111 II I I lllllllllllllllll I lllllllll I II II I 
S: 130866 TAAAAAAATTTAGGTA-TAATATAAATATATATATATATATATATATTTTATATTGTACT 130924






Q: 264 AACTTUICACAAGG-»G<3ATCT-TGTGACTCTA-AGAOTGCGTTTTGTCATC-AAGAC^ 209

1 II III II- II II III ill 1 1 1 INI II II 1 M

S: 130925 TATTATTTTTAGGGO^T-ATiUATG-GaATATATAAACTTT-TCTTGTTATATJ^^ 130981

Q: 208 AACAGATGCAAGATGCATCACTGaVT-TACTTCaVTAGAGTTGTAAAA-TA-ATCC 152

Mil MM 1 It MM 1 Mil 1 M 1 MM III
II 1 1 II 1 1 1 11 Mil 1 II 1 1 1 II 1 II 1 1 Ml

S: 130982 AAAATAATAAATATAAAATA-TA-ATATATATATATATATATATATATQTATATATTTA- 131038

Q: 151 TATTAdAATATTTTTCTGTCA 131

1 III i III II 1 III

S: 131039 T-TTATA-TATGTTG-TATCA 131056 (SEQ ID NO: 49)




Score = 154 (29.2 bits), Expect = 1.3, Sum P(2) = 0.72

Identities = 88/143 (61%), Positives = 88/143 (61%), Strand = Minus / Plus




Q: 309 ATAGTAGTTACATAAATATATATATATAAATATATTTTTGTTT-ATAACTAAOICAAGGC 251

III II II III iiiiiiiiiiiii iiiiiiii III III III nil 1

S: 111140 ATA-TATATATATATATATATATATATATATATATTTATGT(3ACA.TATTTAA-ACAAACC 111197

Q: 250 AGaATCTTGTaACTCTAAQAQT(K!OTTTTaTCATCAA{aACAAAACAaATO 191

1 1 llll III II MM ill 11 11 11 11 1 III

S: 111198 AACAA-T-GTTAA AATATTTCCTTTT--CATTTATAGCATATT^TATAATTT-CAT 111249

Q: 190 CACTGCATTACTTCCATAGAGTT 168

II II II II II III

S: 111250 AAATCCACGACCTCTTTATTGTT 111272 (SEQ ID NO: 50)




Score = 152 (28.9 bits), Expect = 1.5, Sum P(2) = 0.78

Identities = 46/64 (71%), Positives = 46/64 (71%), Strand = Minus / Plus




Q: 317 TAAAGCyiCATAG-TAGTTACATAAATATATATATATAAATATATTTTOQTTTATAACT 259

llll II llll II Ml IIIMIIIIIIII lllll 11 i 1 llll III

S: 132804 TAAACCATGTTGATATATATATATATATATATATATATATATAATTATATGTATGAATAA 132863

Q: 258 CACA 255

1 1

S: 132864 AATA 132867 (SEQ ID NO: 51)





Score = 150 (28.6 bits), Expect = 1.8, Sum P(2) = 0.84

Identities = 42/54 (77%), Positives = 42/54 (77%), Strand = Minus / Plus



Q: 316 AAAGCACATAQ-TAQTTACATAAATATATATATATAAATATATTTTTGTTTATA 264 

II mill II III III iiiiiiiiiMii null i i i iiii 

S: 124826 TkATACACATATATA-TTATATATATATATATATATATATATATATATATATATA 124878 (SEQ ID NO: 52)

Score = 150 (28.6 bits), Expect = 1.8, Sum P(2) = 0.84

Identities = 100/171 (58%), Positives = 100/171 (58%), Strand = Minus / Plus

Q: 300 ACATAAATATATATATATAAATATATTTTTGTTTATAACTAACACAAGGCAGG-ATCTTG 242

1 1 llllllllllillll IIMIIIIII II t 1 1 1 11 III

S: 126885 AAAAAAATATATATATATATATATATTTTTAAATAACAAAATGATTAACAACGTATCAGT 126944

Q: 241 TQACTCTAAGAGTGCGTTTTGTCATCAAGACSVAAACAGATGCaVAQATGCATCA^ 182

II 1 III 1 III 1 M 1 11 III II il MM II

S: 126945 TGCCA-TAACCCTT-GTTGGTTTATTAGCACTrrTTAAAAGTG-GTT--AT-ACTGT^ 126997

Q: 181 ACTTCCATAGAGTTGTAAAA-TAATCCTTAA-TATTA6AATATTTTTCTGT 133

III 1 II llll llll llll III III 1 llll 1 1 II

S: 126998 ACTCACOTA-AGTTTAAAAAGTAATATATAAATATAAATATATATATATAT 127047 (SEQ ID NO:53)



Score = 150 (28.6 bits), Expect = 1.8, Sum P(2) = 0.84

Identities = 138/240 (57%), Positives = 138/240 (57%), Strand = Minus / Plus
Q: 332 TACAGXAATTTTCTTTAAAGCACATAGTAGTTACATAA--ATATATATATATAAATATAT 275 

II IIIIIIIM ill I I III II I II llllllllll IIMII 

S: 145182 TAAAQTAATTTTTTTTCTTTTTTTTrrT--TXAAATGATaATTTATATATATATATATAT 145239 
Q: 274 TTTTGTTTATAACTAACACAAG-GCAGGATCTTGTGACTC-TAAGAGTGCGTTTTGT^ 217 

I I I I II III I III nil I I II Ml II I III 

S: 145240 ATATATGGAAAAAAAAAATU^TTTGCAT-ATCTAAATATACATAGGCGTAAATTGTACCAT 145298 

Q: 216 Cy^-AGACA-AAACAGATGCf^GATOCUVTCACTGCATTACOT 160 

I I I lllll J II I 111 I I II III I I I III III 
S: 145299 TTTAAAAATAAACAATTAGAACyUA-AT-ATTATAATATATACA-A-A-TAATA^ 145353

Q: 159 ATCCTTAATAT-TAGA-ATATTTTTCTGTC--ACTTAG-CAAAAGTGGTTCAGTTCATTG 105 

I lllll II I Mllllll M MM Mill III I i MM 

S: 145354 AAGAAGAATATGTATATATATTTTTATTTTTTACATATTCAAAA-TAGTGTATAT-ATTG 145411 
(SEQ ID NO: 54) 

Score = 148 (28.3 bits), Expect = 2.2, Sum P(2) = 0.89

Identities = 38/47 (80%), Positives = 38/47 (80%), Strand = Minus / Plus



Q: 311 ACATAGTAGTTAC7VTAAATATATATATATAAATATATTTTTGTTTAT 265 

I Ml II II III IMIIIIIIIIII llllll III Mill 

S: 127026 AAATA-TAAATATATATATATATATATATATATATATATTTATTTAT 127071 (SEQ ID NO: 55)

Score = 146 (28.0 bits), Expect = 2.7, Sum P(2) = 0.93

Identities = 36/43 (83%), Positives = 36/43 (83%), Strand = Minus / Plus
Q: 301 TACATAAATATATATATATAAATATATTTTTGTTTATA-ACTA 260 

II III IIIIMIIIMll llllll III llllll I II 

S: 133748 TATATATATATATATATATATATATATATTTATTTATATATTA 133790 (SEQ ID NO: 56) 
Score = 146 (28.0 bits), Expect = 2.7, Sum P(2) = 0.93

Identities = 42/55 (76%), Positives = 42/55 (76%), Strand = Minus / Plus
Q: 316 AAAGCAC-ATAG-TAGTTACATAAATATATATATATAAATATATTTTTGTTTATA 264 

MM I III II II III lllllllllllll Mllllll I I MM 

S: 134066 AAAQAAATATACCTATTTGTATATATATATATATATATATATATTTATTTATATA 134120 
(SEQ ID NO: 57) 

Score = 142 (27.4 bits), Expect = 4.0, Sum P(2) = 0.98

Identities = 40/53 (75%), Positives = 40/53 (75%), Strand = Minus / Plus
Q: 315 AAGCACATAG-TAGTTACATAAATATATATATATAAATATATTTTTGTTTATA 264 

II I III M I I Ml IIIMMIIIIM llllll I III MM 

S: 130339 AAATAAATAAATATTAATATATATATATATATATATATATATATATGTATATA 130391 (SEQ ID NO:58)

Score = 141 (27.2 bits), Expect = 0.037, Sum P(3) = 0.037

Identities = 113/198 (57%), Positives = 113/198 (57%), Strand = Minus / Plus
Q: 1227 AAACAAAGGGTAAACAAAAAACTAAAACTTATACA-AGATACCATT-TACACTGAAC^^ 1170 

III III III llllll III I ' MM I I III III Mill III 

S: 32462 AAAGAAAAAAAAAAAAAAAAA-TAATATATATATATATATATTATTATATA-TAATTATA 32519 
Q: 1169 GAATTCCCTAGTGGAATGTC-ATCTATAGTTCACTCGGAACATCTCCCGTGGACTTATC- 1112 

II II I Ml I II II M I I I III lllll 

S: 32520 TAGATAAATAATATAATCTTTATAAAAAATTAA-TA--ATCATAAAATGTTAAAATAAAA 32576 
Q: 1111 TGAAGTATGACAAGATTATAATGCTTTTGGCTTAGGTGCAGGGTTGCAAAGGGATC^^ 1052 

MM II I II I III I Mill III I I I Ml I I II 

S: 32577 TGAAATAA-A-AACACTATTTTTTATTTGGTTTAAATATAAAGAAAAAAAAAAA- -A-AA 32631 




Q: 1051 AAAAAAAATCATAATAAA 1034 

iiiiiiiii III nil 

S: 32632 AAAAAAAATTATATTAAA 32649 (SEQ ID NO:59)

Score = 141 (27.2 bits), Expect = 4.4, Sum P(2) = 0.99

Identities = 47/63 (74%), Positives = 47/63 (74%), Strand = Minus / Plus
Q: 317 TAAAGCAOVTAGTAGTTACATAAATATATATATATAAATATATTTTTG^ 260 

Mil Mill I I mill IIIIMIIIIIII Mil I I I Mil 11 II

S: 116113 TAAACAACATA-T-GATACATATATATATATATATATATAT-TATAAATATTATAATCTT 116169
Q: 259 ACA 257
S: 116170 AAA 116172 (SEQ ID NO: 60)

Score = 141 (27.2 bits), Expect = 4.4, Sum P(2) = 0.99

Identities = 117/198 (59%), Positives = 117/198 (59%), Strand = Minus / Plus
Q: 328 GTAATTT-TCTT-TAAAGCACATAGTAGTTACATAAA-TATATATATATAAATATATTTT 272

Mill I II I I II III M II Mill IIIMIIlllil llllll II

S: 145723 GTAATATATCATATTTTTOV-ATA-TATTTT-ATAAAATATATATATATATATATATATT 145779
Q: 271 TGTTTATAACT-AACACAAGGCAGGATC-TTQTaACTCTAAQAGTGCGTTTTOTCATC^ 214

INI I I II I II I II II I III II I II 11 Mil

S: 145780 ATATTATGAATGAAAATAACT-ATQCTAATTAT-ACT-TATTAA-G-GACTTCT-ATAAT 145833
Q: 213 GACyU^CaiGATGCAAGATGC^ATCACTGCATTACTTCCATAGAGraG 157

Ml I I I I M II M II I I III I I I llllll I

S: 145834 TTTTAAATATCTTAATGTGGCGATTTTAGC-TTCCC-CTATAAAAATAAATAAATAAATA 145891

Q: 156 CTTAATATTAGAATATTT 139 

III II i llllll 
S: 145892 AATAA-ATAAATATATTT 145908 (SEQ ID NO: 61) 
Score = 140 (27.1 bits), Expect = 4.8, Sum P(2) = 0.99

Identities = 48/67 (71%), Positives = 48/67 (71%), Strand = Minus / Plus



Q: 317 TAAAGCACATAG-TAGTTACATA-AATATATATATATAAATATAT-TTTTGTTTATAACT 261 

MM MM II II II Ml Mill I II III M III Mill III II

S: 145886 TAAATAA-ATAAATAAATATATTTAATATATATATATATATTTATATTTTCSATTAaU^ 145944 

Q: 260 AACACAA 254 
II I M 

S: 145945 AAAAAAA 145951 (SEQ ID NO: 62)

Score = 137 (26.6 bits), Expect = 6.4, Sum P(2) = 1.00

Identities = 39/52 (75%), Positives = 39/52 (75%), Strand = Minus / Plus

Q: 311 ACATAGTAGTTACATAAATATATATATATAAATATATTTTTGTTTATAACTA 260

I III II II III lllllllllllll III I III llllll M 
S: 135783 ATATA-TATATATATATATATATATATATATTTATTTATTTATTTATATTTA 135833 (SEQ ID NO: 63)

Score = 137 (26.6 bits), Expect = 6.4, Sum P(2) = 1.00

Identities = 105/190 (55%), Positives = 105/190 (55%), Strand = Minus / Plus

Q: 300 ACATAAATATATATATATAAATATATTTTTGTTTATAACTAACACAAGGCAGGATC^^ 241 

I III IIIIIIIIIMII MUM I M MM II I I III II I 

S: 109971 ATATATATATATATATATATATATATATATGAAGATAA-TATTATATAACAGTATTTACA 110029
Q: 240 GACTCTAAGAGTGCGTTT-TGTCATCAAGACAAA-AC-AGATGCAAGATGCATCACTGCA 184 

I I II I I I I I Ml I I M I II II I I III 

S: 110030 TAATGTATCATTTaATATATTTCTTTTATTCaTATATTATATCAAAAAAACPAAAATTTA 110089 




Q: 183 TTACTTCCATAQAGTTGTAAAATAATCCTTAATATTA-GAATATTTTTCTQTCACTTAGC 125 

„ ..III I I I I mill II I INI I I II II I Mill 

S: 110090 TTTATAGGTTTCAAATA-ACAATAATTCTGA-TAATATGTAAATGATTATA-CACTTTAT 110146 


Q: 124 AAAAGTGGTT 115 

II I I III 

S: 110147 AATATTTGTT 110156 (SEQ ID NO: 64) 

Score = 136 (26.5 bits), Expect = 7.0, Sum P(2) = 1.00

Identities = 70/110 (63%), Positives = 70/110 (63%), Strand = Minus / Plus



Q: 317 TAAAGCACATAG-TAGTTACATAAATATATATATATAAAT-ATA-TTTTTG-T-TTAT-A 264 

Mil IIMII II II III lllllllllllll II II Mill I II M 

S: 117660 TAAA-CACATATATATATATATATATATATATATATATATTATGCTTPTTTATCTTTTCA 117718
Q: 263 ACTAACACAAGGCAGGATCTTGT-GACTCTAAGAGTOCGTTTTGTCATCA 215 

I I II I II I Mil I II I II II Mini 

S: 117719 ATTGACTAATCCCAATTT-TTGTAGTCTTTCTTTTAGCTTTCCTTCATCA 117767 (SEQ ID NO:65)

Score = 136 (26.5 bits), Expect = 7.0, Sum P(2) = 1.00

Identities = 114/197 (57%), Positives = 114/197 (57%), Strand = Minus / Plus
Q: 316 AAAGCACATAQTAGTTACATAAATATATA'TATATAAATATArrTTTGTTTATAACTAACA 257

Ml I Ml II 11 III lllllllllllll llllll I II II III I 

S: 151968 AAATAAAATA-TACATATATATATATATATATATATATATATAACAATATAGAA-TAATA 152025 
Q: 256 CA-AGGCAQGATC-TTGTQACTCTAAGAQTGCQTTTTGTCATC-AAGACAAAACAQATGC 200

M llllll llllll II II II 1 nil Mil

S: 152026 TATAATAAGAATTATTAAAAATATAACAAAAAATAATAATTTCTAAAATAAAA-AGAT-- 152082

Q: 199 AAGATGCATCACTGCATTACTTCCATAGAGTTOTAAAATAAT-CCTTAATATTAGAATAT 141

1 1 II M 1 1 II 1 1 1 II II II II nil 1 1 II III

S: 152083 A-G-TGAAT-A-TAAATAAATAA-A-AGCTGTATTCCATCATACCTTGACACCTGA-TAT 152135



Q: 140 TTTTCTQTCACTTAaCa. 124 

nil I II Mil 

S: 152136 TTTT-TCCCAAACAGCA 152151 (SEQ ID NO: 66) 
Score = 135 (26.3 bits), Expect = 7.7, Sum P(2) = 1.00

Identities = 37/48 (77%), Positives = 37/48 (77%), Strand = Minus / Plus



Q: 311 ACATAGTAGTTACATAAATATATATATATAAATATATTTTTGTTTATA 264 

N M llll Ml MIIIIIIMIM nil I III I III! 

S: 134076 ACCTATTTGT-ATATATATATATATATATATATATTTATTTATATATA- 134122 (SEQ ID NO: 67)

Score = 134 (26.2 bits), Expect = 8.5, Sum P(2) = 1.00
Identities = 78/121 (64%), Positives = 78/121 (64%), Strand = Minus / Plus

Q: 317 TAAAGCACATAQ-TAGTTACaiTAAATATATATATATAAATATATTTTTGITTATAACTA- 260 

III! 1 III II II III llllllllllll II III I II III I II 

S: 109965 TAAT^AAATATATATATATATATATATATATATATATGAAGATAATATTATATAACAGTAT 110024 
Q: 259 --ACACAAGGCAGGATCTTQ-TG-ACT-CTAAGAGTGCGT-T-TTGTCATCAAGACAAAA 207 

Ml II I I II III I I I II II III I II I MUM llll 

S: 110025 TTACaiTAATCTATCAT-TTGATATATraCTTTTATT-CGTATATTAT-ATCAA-A-AA^ 110079 

Q: 206 C 206

I 

S: 110080 C 110080 (SEQ ID NO:68) 



Score = 129 (25.4 bits), Expect = 0.11, Sum P(3) = 0.11

Identities = 211/393 (53%), Positives = 211/393 (53%), Strand = Minus / Plus



Q: 1413 GAAGGCTTTATTGTATCaaAGaVGTGCTAAAATTTCTAGGACAGflACAACAC-CA-GTAC 1356 

• III mill MM II III III II II II II I II II I 

S: 38617 GAAAATTTTATTTCTTCATAT-ATTCATAATATTA-TATCy^TAAAAATGCTCACGTCC 38674 



Q: 1355 TGGTTTACAGOTGTTAAGACTAAAATTTTGC-CTATACCTTTAA-GAC-AAACAAACAAA 1299

I I nil II II I I I I II II II lllllllllll 

S: 38675 TT-TATACAAAGACyUUVCATTTCTGTAAAOUlCGACAGCATAAATGAAGAAAakAA 38733 



Q: 1298 CACACACAaUU^CaUVGC-TCTAAGCTGerGTAGCCaX^AAGAAGA-CAAGAT^ 1241 

II II III II I I III I lllltl I lllll I I II 

S: 38734 CAAATAATTAAAGAAACATTTAAAGAAAAACATAAAGAAGAACAACAAGAATA-TAT^ 38792 



Q: 1240 AGCTCAGCCCAGGAAACAAAGGGTAAACAAAAAACT-A--AAACTTATACAAO-ATACCA 1185 

III I lllll I I II mill I nil I Mi Ml I 

S: 38793 TO-T-ATATAATAAAACATATCATGAAGAAAAAAAGCAGGAAACCAAAGTAAGCATAAAA 38850 



Q: 1184 TTTACy^C-TGAACATA-GAAT-TCCCTAQTGGAATGTCATCTATAGTTCACTCGGAACAT 1128 

im I MM Mill III III lllll MM II II 

S: 38851 TTTAATTATCAACTUVATGGATGTTAAAAATG-AATATTTTC-ACA- --O^CTTAAAAAAT 38905 



Q: 1127 CTCCCGTGGA-C-TT-ATCTGAAGTATGACAAGATTATAATGCTTTTGGCTTAGGTGCAG 1071 

II MUM M I I I II III MM I II MM I 

S: 38906 ATOUU^CTAATCATTCy^TTCAAAAT-TAATAATATTTTAATAAATATGTAT-ATGTATAA 38963 



Q: 1070 GGTTGCAAAGGGATCAGAAAAAAAAAATCATAA 1038 



S: 38964 CCCTT-ATAT- -ATGTGCAAAAAAAAATAAAAA 38993 (SEQ ID NO: 69) 
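
By way of illustration only, the header lines that precede each alignment in this listing can be read programmatically. The following is a minimal sketch, not part of the search program itself: the function names `parse_hsp_header` and `p_from_expect` are hypothetical, the regular expressions assume the "Score = ..., Expect = ..., Sum P(n) = ..." and "Identities = m/n (p%)" layout shown above, and the conversion P = 1 - exp(-E) is the standard Poisson relation between an expectation and a probability, which is assumed (not confirmed by this document) to underlie the listed Sum P values. The percentages printed in the listing appear to be truncated rather than rounded, so integer division is used.

```python
import math
import re

# Hypothetical helpers; the patterns assume the header layout used in this listing.
SCORE_RE = re.compile(
    r"Score\s*=\s*(\d+)\s*\(([\d.]+)\s*bits\),\s*"
    r"Expect\s*=\s*([\d.eE+-]+),\s*"
    r"Sum P\((\d+)\)\s*=\s*([\d.eE+-]+)"
)
IDENT_RE = re.compile(r"Identities\s*=\s*(\d+)/(\d+)\s*\((\d+)%\)")


def parse_hsp_header(score_line, identities_line):
    """Extract the numeric fields from one Score line and one Identities line."""
    s = SCORE_RE.search(score_line)
    i = IDENT_RE.search(identities_line)
    if s is None or i is None:
        raise ValueError("unrecognized HSP header")
    matches, length = int(i.group(1)), int(i.group(2))
    return {
        "score": int(s.group(1)),
        "bits": float(s.group(2)),
        "expect": float(s.group(3)),
        "segments": int(s.group(4)),
        "sum_p": float(s.group(5)),
        "matches": matches,
        "length": length,
        # Truncated percentage, matching how the listing prints it.
        "percent_identity": 100 * matches // length,
    }


def p_from_expect(expect):
    """Poisson probability of at least one chance hit, given expectation E."""
    return 1.0 - math.exp(-expect)


hsp = parse_hsp_header(
    "Score = 129 (25.4 bits), Expect = 0.11, Sum P(3) = 0.11",
    "Identities = 211/393 (53%), Positives = 211/393 (53%), Strand = Minus / Plus",
)
print(hsp["percent_identity"], round(p_from_expect(hsp["expect"]), 2))
```

Under these assumptions the recomputed probability agrees with the listed Sum P value only to within the rounding of the printed Expect value.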



>gb|AC007370.7|AC007370 Homo sapiens, clone 22_A_3, complete sequence
Length = 176,426
Minus Strand HSPs:

Score = 259 (44.9 bits), Expect = 0.0091, Sum P(2) = 0.0090

Identities = 113/164 (68%), Positives = 113/164 (68%), Strand = Minus / Plus

Q: 1424 TTGGAGAGAAAGA-AGGCTTTATTGTATC-AGAGCA-GTGCTAAA-ATTTCTAGGACAGA 1369 

mil I MM lllll III I II MM MM MM I I MM 

S: 82468 TTGGA-ACAAAGCCATGCTCTTTTATTTGTATATTATGTG-TAAATAITTTT-GCACAGT 82524 
Q: 1368 ACA-ACACCAGTACTGGTTTACAGGTGTTAAQACTAAAAT-TTTGC-CTAT--ACCTTTAA 1313 

I .MM MM I II I MM Ml IMMII Ml I MM II II 

S: 82525 ATTGACAC-AGTA-TAGTA-ATAGGTCCCAAGTCTAAAATATTTACTCTATTACGCTTCX5 82581 

Q: 1312 GACAAACAAACAAACACACACACAAACAAGCT-CTAAGCTQCTQ 1270 

I lllllllllllll llllillll II I Mil li III 

S: 82582 CA-AAACAAACAAACAAACACACAAAAAAATTGCTAATCT-CTG 82623 (SEQ ID NO: 70) 

Score = 213 (38.0 bits), Expect = 0.0091, Sum P(2) = 0.0090

Identities = 105/162 (64%), Positives = 105/162 (64%), Strand = Minus / Plus

Q: 363 ATAGAAAGAATGAAAA-AACCCTGCTGAATCATACAGTAATTTTCTT-TAAAGCACATAG 306 

III IMIII lllll III Ml MM I lllll III MM I 

S: 138890 ATACAAAGAAAGGAGATAATCCTAATGAC-CATAATGACAATTACTAATAATAAACATGG 138948 
Q: 305 TAGT-TACATAAATATATATATATAAATATATTT-TTGTTTArAACTAACACAAGGaU3- 249 

I II III 11111111111111 lllll I II I II II II lllll Ml 

S: 138949 CAAACTATATATATATATATATATAA-TATATATATTCTCTAAAA-TATTACAAGACAGT 139006 

Q: 248 GATCTTGTGACTCTAAGAGTQCGTT-TTGTCATCAAQACAAA 208 

II II III II I I I 111 II III III i 

S: 139007 GAAAATG--ACTGAAATA-TAATTCCTTG-CA-CAAAACATA 139043 (SEQ ID NO:71)






Score = 128 (25.3 bits), Expect = 0.16, Sum P(3) = 0.15

Identities = 74/112 (66%), Positives = 74/112 (66%), Strand = Minus / Plus
Q: 572 TCCCCAACTGTTAAGTCCCA---TGA--AAC-CACyVGTTGCT-CTGGGC-TGATGGAAAC 521 

I 1(11 Mil I III III III III III III I II III I I 

S: 89248 TACCCACATGTTCTTTGCCAAAGTGAGCAACACACCCAAGCTTCTGCCCyVTG 89305 
Q: 520 AAAAGGAAACAGTATGAAGAGTTCC--TTAATCATTXT-TQAAACAAAAATG 472 

lllllllll Mil Mil II I IMIIIIill I I II II II 

S: 89306 AAAAQGAAAGAGTA--AAGAATTGCAGTTAATCATTTAATCTACCACAACTG 89355 (SEQ ID NO: 72)



>gb|AC004605.1|Hl7AC004605 Homo sapiens Chromosome 16 BAC clone
CIT987SK-A-248F7, complete sequence
Length = 259,474
Plus Strand HSPs:

Score = 238 (41.8 bits), Expect = 0.011, Sum P(2) = 0.011
Identities = 448/815 (54%), Positives = 448/815 (54%), Strand = Plus / Plus





Q: 678 QTTCGGAAATTTTQTOVCAGCTCATAGAAAGGTCATTCATTACA^aaAAGAGQCAGACA 737

ill III 1 1 nil II iiinii III I I 1 n ni iii

S: 253238 GTT-GQAGAGATGGTCAAAGAACATA-AAATTTCAGTTAGGAGGAAT-AAGTTCAAGAGG 253294

Q: 738 TGGTTTGAACCTCCTG-T-ACGGATAAAAAGATTCGTTCTGTAAAGTGATGAAC-TTGTT 794

II III II 1 11 1 11 III 11 1 11 1 III 1 MM III 1

S: 253295 TGTATTGTACAACATGGTGACT-ATATTAATAATCATG-TGTCTTATTTTGAAAATTGCT 253352

Q: 795 GGTACAGATGGATCT-AGA-C-CAGAAAA-ACAACATCCTTTTQCCACACATCTGTTT-T 849

1 1 II Hill III M 1 1 MM II 1 II 1 1 Ml 1 1

S: 253353 G--AGAG-TGGA'rTTTAGTGCTCCCACCACACAA-ATGGTAT-GTGAGGTAA-TGTATAT 253406

Q: 850 GAAAATGTAGAAAATCTCTTATTTTGT6TTTGTTCTCA.CTCTTGCAT(5GGC-TGTTTTTC 908

II III 1 II III II 1 II It 1 1 nil 1 1

S: 253407 GTTAGCTTAGTTGAG-TC--ATTCTGa^TGAATATQTATTTCAAAACy^CATGTTGTAC 253463

Q: 909 TTAGCAATCTTGAGACATTCa^Ta^TTGCrCTTTTGAGTGAAAGGGTTC^^ 968

II 11 II III M III III \'\ III II 1 II Mill

S: 253464 ATAATAAA"-T-ACACAAG-ATG-TTG-TCTAT-GTTAAAAACCCATCGGACT--GGAAA 253514

Q: 969 ACAGTTATCTTTGA--GAaV-Ta^G-AGAACATTTGAACCAAAGGTGTCGATCCCTCATG 1024

nil MM 1 M 1 II III Mill 1 Mill III 1 II

S: 253515 T-AGTTC-CTTTCACTGAAACTCTTCAQATCATTTTGTTTQAQQGTaTTT-TCCTT--TQ 253569

Q: 1025 AACTAAAGC-TTT-ATTAT-GATTTTTTTTTTCT-GATCCCTTTGCAACCC-TGCAC-CT 1078

1 II III III III 1 III nil n 1 III 1 nil 1 1 i i

S: 253570 ATCTCCAGCCTTTTATTGTTGATGTTTTGGTTGTTGATTG--TAGCAATTCATTTTCTCA 253627

Q: 1079 AAGCCAAAAGCATTAT-AATCTTGTCATACT--TCAGATAAGTCCACGGGAGATGTTCCG 1135

III! 1 II 1 IIIIIM II II II Mill n II II 1

S: 253628 AAGCTGCCC-CTTTTTCAATCTTQAAAT-CTGATCGGATAAAACGAAGG-AQQGAAAAQG 253684

Q: 1136 AGTGAACTATAGATGAC-ATTCCACTAGGG-AATTCTA-TGT--TCAGTGTAA-ATGGTA 1189

111 1 II II t II Ml II 1 1 II MM 1 MM

S: 253685 AAAGAGTAAGAGGTAAGGAGGTCAAGAGGATAAATGAAGTGGGATCAGGCTGGGAGGGAA 253744

Q: 1190 TCTTGTATAAGTTTTAGTTTTTTGTTTACCCTT"TGTTTCCTGGGCTGAG-CTTGTCCAG 1247

MM 11 MM 1 1 MM 1 1 1 111 MM MM II

S: 253745 TCTGQ-ATGGCTGTTATTCTGCAGGTTAGCATGCTAAGGGCTGTTCTGTGGCTTGAACTQ 253803

Q: 1248 AA-ATCTTGTCTTCTTCAGGCTACAGCAGCT-TAGAQCT-TGTTTGTGTGTGTGTTTGTT 1304

111 II II III 11 II M III M III lllllllllll Ml

S: 253804 AACAACTGG-CACCT-CTGGAAAC--CTGGTCTAT-GTTGTGTGTGTGTGTGTGTGTGTG 253858

Q: 1305 TGTTTGTCT-TAAAGGTATAGGCAAAATTTTAGTCT-TAAC-ACCTGTAAACCAGTACTG 1361

III II 1 III II II 1 Mill 1 1 1 Mil II 1 Mil 1

S: 253859 TGTGCGTGTGTACACTCAGAATTCAA-TGTTAGTTTCTTCCTACCT-TA--CTAGTATTC 253914

Q: 1362 -QTGTTQTTCTGTCCTAGAAATTTTAQCACTQCTCTGATACAATAA-AQCCTTCTTTC 1419

1 nil II 1 II II III lllllll 1 II 1 III II

S: 253915 AGGGTTGC-CTTTATTAACCACTAGT6GAATCAAATGACACATTCACAGTAT-CTTG-TC 253971

Q: 1420 TCOUICTGGTTC-AACTTCAGCATAGGCAGGATGT 1453

1 1 iiiiii II HUM II 1 1 II 1

S: 253972 T--ATGTGGTTCCAAAGTCAGCA-AGACTG-ATCT 254002 (SEQ ID NO: 73)





Score = 236 (41.5 bits), Expect = 0.011, Sum P(2) = 0.011

Identities = 242/421 (57%), Positives = 242/421 (57%), Strand = Plus / Plus


Q: 122 TTTQCTAAGTGACAG-AAAA-ATATTCT-A-ATATT-A-AGGAT-TATTTTACA-ACTCT 173

III II 1 1 1 nil iiiii 1 1 inn i i n iiiiiii mm

S: 95564 TTTTATATATTATATTAAAATATATTTTTATATATTTATAATATATATTTTATATAATAT 95623

Q: 174 ATGGAAG-TAATGCAGTGATGC-ATCTTGCaiTCTGTTTTGTCOTGATGAC^^ 231

II 1 1 III 11 1 nil II Mill III 11 1 1

S: 95624 ATAGTATATAAAAATGTATTTATATATTAAATACATATT-TAT--ATGTAAATATATATA 95680

Q: 232 CraAGAGTCACAAGATCCTGCCTTOTGT-TAGTTATAAACTAAAATATATTTATATAT^^ 290

III III M 11 1 1 II II .mil M iiiiiiiiini n

S: 95681 ATTA-AGT-ATAC-ATT-TATAT-GTCyVATATATATAATTAATTATATATTTATATGTAA 95735

Q: 291 ATATT-TATGTAACTAC-TATGTGGTT-TAAAGAAAATTACTGTATGAT-TG^ 349

Mil II 111 II III 1 1 1 II nil 1 1 III Ml II III

S: 95736 ATATATAATTAAATATATATTTACATATATGTAAAT-A-TATATAATTAATTATA*TATT 95792

Q: 350 TTCAT-TCT-T-TCTATCGCC-ATaC-Ta-ACGTGAATGGCTTCTTAGACAGG--AATCA 401

Mil 1 11 III 11 III III 1 IIIII lit

S: 95793 TACATATAAATATATATTAAATATATATTTATAT-AATATATAATTAAATATATAAATAT 95851

Q: 402 GOUITGAAATGGGTGTTTAGTACTGAAAAGGCATAAAATTAAAAATA-ATCATATOTT-T 459

III nil 1 nil II n in i iin inn in n iin i i

S: 95852 ATAATTAAATATATATTTA-TA-TGTAAATGTATAA--TTAAATATATATXATATAATAT 95907

Q: 460 AAAATCCCTTAACATTT-TTGTTTCAAAAATGATTAAGGAACTCTTC-ATACTGTTTCCT 517

Mil llllll 11 1 11 Mllllll 1 III nil 1! 1

S: 95908 ATAAT---TAAATATATATTATATAATATATAATTAAATATATATTATATAATATATAAT 95964

Q: 518 T 518

1

S: 95965 T 95965 (SEQ ID NO: 74)




Score = 221 (39.2 bits), Expect = 0.047, Sum P(2) = 0.046
Identities = 229/403 (56%), Positives = 229/403 (56%), Strand = Plus / Plus


Q: 139 AAATATTCTAATATTAAGGATTAT-TTTACAACTCTATGGAAGTAATGCaLGTGATO 197

llllll nil ill II II linn i iiii n ii i ii ii ii

S: 95734 AAATATA-TAAT--TAAATAT-ATATTTACA--TATATGTAAATA-TATA-TAATTAATT 95785

Q: 198 TTGCATCTGTTT-TGTCTTGATGACAAAACGCACTCTTAGAGTCAOUV^ 256

IIII 11 lllllll 1 11 11 II 1 n 1 1

S: 95786 ATATATTTACATATAAATATAT-ATTAAATATATATTTATA-TAATAT-ATAATTAAATA 95842

Q: 257 TQTTAQT-TATAAACAAAAATATATTTATATATATATATTTATGTAACTACTATGTGCTT 315

IIII mil III niiiiiiiiii II n i ii iii ii in i i

S: 95843 TATAAATATATAATTAAATATATATTTATATGTAAATGTATAATTAAATA-TATATTATA 95901

Q: 316 TAA-AGAAAATTAC-TGTATGATTCAGCAGGGTTTT-TTCATTCTTTCT--ATCGCCATQC 371

III 1 1 Hill 1 III III 1 1 1 II 1 1 1 1 1 II II

S: 95902 TAATATATAATTAAATATAT-ATTATATAATATATAATTAAATATATATTATATA-ATAT 95959

Q: 372 TGACGTGAATGGCTTCTTAGACAGGAATCAGCAATGAAATGGGTGTTTAGTACTGi^^ 431

1 Mil 1 III II 1 II III nil 1 1 II II 1 II 1

S: 95960 ATAATTAAATATATA-TTATATAATA-T-AT-AATTAAATATATTTATA-TAATAAATAT 96014

Q: 432 GCATAAAATTAAAAATAATCATATGTTTAA-A"ATCCCTTAACAT-T-TTTGTTTCAAAA 487

1 II lllllll III MM 1 III 1 II II II II Mill 1 II

S: 96015 -C-TATAATTAAATATA-T-ATTTATATAATATATAACTAAATATATATTTATATAGTAA 96070

Q: 488 ATGATTAAGGAACTCTTCATACTGTTTCCTTTTGTTTCCATCA 530

II II I INI III III! II I Hill

S: 96071 AT-ATCTAT-AACTAAATATA-TATTTA-TATAATATCTATAA 96109 (SEQ ID NO: 75)
Score = 196 (35.5 bits), Expect = 0.53, Sum P(2) = 0.41

Identities = 226/398 (56%), Positives = 226/398 (56%), Strand = Plus / Plus



Q: 141 ATATTCTAATATTAAGGATT-ATTTTACAACTCTATGGAAGTA-ATGCAGTGATGCA-TC 197

III MM III III! 1 1 M II lllllll 1 III 1 1

S: 96528 ATGTAATAATTATATQTCTTTATGTAATAAAGAAAT--AATTATATOTCTTTATQTAATA 96585

Q: 198 TTGCATCTG-TT-T-TGTCTTGATGACA-AAACGCAC-TC-TTAGA-GTCACAAGATCCT 250

1 11 1 1 1 1 M II M 1 II MUM 1 lllllll 1 1 1
1 M 1 1 1 1 M i M 1 M 1 1 M 1 1 1 1 1 M 1 M 1 1 1 1

S: 96586 AAG-ATATAATTATATGTCrTTATGTAATAAAGGTATATAATTATATGTCTTTATGTAAT 96644

Q: 251 GCCTTGTGT-TAGTTATAAACAAAA-ATATATTTATATATATATATTTATGTAACTACTA 308

II 1 II Mill III 1 Hill iiiiiiii mil II II 1 II

S: 96645 AAAG-GTATATAATTATATACATGATATATAATTATATATTTATATA-ATA-AAQ-A-TA 96699

Q: 309 TGTGCTT-TAAA-QAAA-ATTACTGTATQATTC-AGOIGSGTTTTTTCATTCT-T-TCTA 362

II II III II 1 III MM II 1 1 1 II Mil II MM

S: 96700 TATAATTATATATGACATATAATTATATATTTATATAATAGATATATAATTATATATCTA 96759

Q: 363 TCGCCATGCTGAOSTGAATGGCTTCTTAGACAGGA-ATCAGCA-ATGAA-ATGGGTGTTT 419

1 II 1 1 III 1 1 II 1 1 11 1 1 II 1 III Mil

S: 96760 TTAT-ATAAATATAT-AATTA-TAOVTATCTATTATATAAATATATAATTATQTATATTT 96816

Q: 420 AGTACTGAAAAGGCATAAAATTAAAAATAATCATATGTTTAAAATCCC-TTAACATTTTT 478

1 II 1 III III nil Mill INI 1 II II III! II 1 1

S: 96817 ATTA-TATAAAT--ATATAATTTATA-TATTTATATAATAAATATGTAATTAA-ATATAT 96871

Q: 479 GTTTCAAAAATGATTA-AGGAACTCTTCATAC-TGTTT 514

III 1 III 1 II 1 II 1 1 III Mill

S: 96872 ATTTATATAATAAATATATAAAAT-TAAATAXATGTTT 96908 (SEQ ID NO: 76)




Score = 195 (35.3 bits), Expect = 0.59, Sum P(2) = 0.44

Identities = 215/384 (55%), Positives = 215/384 (55%), Strand = Plus / Plus


Q: 127 TAAGTGACAQAAAAATAT-TCTA-ATATTAAGGAT-TATT-TTACAACTCTATGGAAGTA 182

II Mill 1 II Ml 1 1 MM 1 III Ml III 1 Mill 1 1

S: 96708 TATATGACATATAATTATATATTTATATAATAGATATATAATTATATATCTATTATATAA 96767

Q: 183 ATGCAGTGATGCTIT-CTTGCATCTGTTTTGTCTTGATGACAAA-ACGCACTCTTAGAGTC 240

II 1 1 M M 1 1 MM II II IIIIIIII lir 1

S: 96768 ATATA-TAATT-ATACAT--ATCTATTATATAAATAT-ATAATTATGTATATTTATTATA 96822

Q: 241 ACAAGATCCTGCCTTGTGTTAQTTATA-AACAAAAATATATTTATATATATATATTTATO 299

II II 1 Mill Mill II III II II III 1 llllllllllll

S: 96823 TAAATATA-TAATTTATAT-ATTTATATAATAAATATGTAATTA-A-ATATATATTTATA 96878


Q: 300 TAACTACTATQTGCTTTARAGAAAAT-TACTGTATOAT-TCAQCaQGGTTTTTTCAT-TC 356 

III II II I I III WW Hill Mil I I II III!- 

S: 96879 TAA-TAA-ArATA--TAAAATTAAATATA-TaTTT-ATATAATAAATATATAATTATATG 96932 

Q: 357 TTTCTATCGCCyiTGCTGACGTGA-ATGGCTTCTTAGACUWSQAATCAQCAATGAAATOaQT 415 

III III I I I I III II I I II III III I III I 
S: 96933 TTTATATAATAAATATATAATTATATGT-TTAT-ATA-ATAAATATATAATTATATGTTT 96989 

Q: 416 GTTTAQTACTGAAA-AGGCATAAAATTAAAAATAATCA-TATGTTTAAA-ATCCCT-TAA 471

I I! II III III I I II III I III Mil II I III 

S: 96990 ATATAATAAATATATAATTATATATGTTTATACyUlTAAATATAATTATATATGTrrATAA 97049 

Q: 472 CATTTTTGTT-TCAAAAATGATTA 494 

II II I I I III III 

S: 97050 AATAAATATAATTATATATGTTTA 97073 (SEQ ID NO:77) 



Parameters:
  V=10
  B=10
  ctxfactor=2.00
  E=10

  Query                      -----  As Used  -----    -----  Computed  ----
  Strand MatID Matrix name   Lambda    K       H      Lambda    K       H
   +1      0   +5,-4          0.192   0.173   0.357    same    same    same
               Q=10,R=10      0.104   0.0151  0.0600    n/a     n/a     n/a
   -1      0   +5,-4          0.192   0.173   0.357    same    same    same
               Q=10,R=10      0.104   0.0151  0.0600    n/a     n/a     n/a

  Query
  Strand MatID  Length  Eff.Length     E     S   W   T    X    E2     S2
   +1      0     2668      2668       10.   227  11  n/a   73  0.024    82
                                                          102  0.023   124
   -1      0     2668      2668       10.   227  11  n/a   73  0.024    82
                                                          102  0.023   124
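
For illustration, the bit scores printed in parentheses in the alignment headers above appear to follow from the raw scores and the gapped statistical parameters listed in the Parameters block (Lambda = 0.104, K = 0.0151, the Q=10, R=10 rows). The sketch below assumes the usual normalization, bits = (Lambda * S - ln K) / ln 2; the function name `bit_score` is hypothetical, and the choice of the gapped rather than the ungapped parameters is an inference from the printed values, not something this document states.

```python
import math

# Gapped parameters as listed in the Parameters block (Q=10, R=10 rows).
LAMBDA_GAPPED = 0.104
K_GAPPED = 0.0151


def bit_score(raw_score, lam=LAMBDA_GAPPED, k=K_GAPPED):
    """Convert a raw alignment score to a normalized bit score."""
    return (lam * raw_score - math.log(k)) / math.log(2)


# Raw scores taken from three of the alignment headers in this listing.
for raw in (169, 238, 259):
    print(raw, round(bit_score(raw), 1))
```

With these assumed parameters, raw scores of 169, 238 and 259 give 31.4, 41.8 and 44.9 bits respectively, matching the values printed in the corresponding headers.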



6. EQUIVALENTS 

Although particular embodiments have been disclosed herein in detail, this has been done
by way of example for purposes of illustration only, and is not intended to be limiting with
respect to the scope of the appended claims which follow. In particular, it is contemplated by the
inventors that various substitutions, alterations, and modifications may be made to the invention
without departing from the spirit and scope of the invention as defined by the claims. The choice
of nucleic acid starting material, clone of interest, or library type is believed to be a matter of
routine for a person of ordinary skill in the art with knowledge of the embodiments described
herein. Other aspects, advantages, and modifications are considered to be within the scope of the
following claims.
WHAT IS CLAIMED IS:

1. An isolated polynucleotide comprising a nucleotide sequence selected from the group
consisting of SEQ ID NO:1, a mature protein coding portion of SEQ ID NO:2, a mature protein
coding portion of SEQ ID NO:3, an active domain of SEQ ID NO:1-3, and complementary
sequences thereof.

2. An isolated polynucleotide encoding a polypeptide with biological activity, wherein said 
polynucleotide hybridizes to the polynucleotide of claim 1 under stringent hybridization 
conditions. 

3. An isolated polynucleotide encoding a polypeptide with biological activity, wherein said 
polynucleotide has greater than about 90% sequence identity with the polynucleotide of claim 1 . 

4. The polynucleotide of claim 1 wherein said polynucleotide is DNA. 

5. An isolated polynucleotide of claim 1 wherein said polynucleotide comprises the 
complementary sequences. 

6. A vector comprising the polynucleotide of claim 1.

7. An expression vector comprising the polynucleotide of claim 1.

8. A host cell genetically engineered to comprise the polynucleotide of claim 1.

9. A host cell genetically engineered to comprise the polynucleotide of claim 1 operatively 
associated with a regulatory sequence that modulates expression of the polynucleotide in the host 
cell. 

10. An isolated polypeptide, wherein the polypeptide is selected from the group consisting of: 

(a) a polypeptide encoded by any one of the polynucleotides of claim 1; and

(b) a polypeptide encoded by a polynucleotide hybridizing under stringent conditions
with any one of SEQ ID NO:1.

11. A composition comprising the polypeptide of claim 10 and a carrier.

12. An antibody directed against the polypeptide of claim 10.

13. A method for detecting the polynucleotide of claim 1 in a sample, comprising:

a) contacting the sample with a compound that binds to and forms a complex 
with the polynucleotide of claim 1 for a period sufficient to form the complex; and

b) detecting the complex, so that if a complex is detected, the polynucleotide 
of claim 1 is detected. 

14. A method for detecting the polynucleotide of claim 1 in a sample, comprising: 

a) contacting the sample under stringent hybridization conditions with 
nucleic acid primers that anneal to the polynucleotide of claim 1 under such conditions; 

b) amplifying a product comprising at least a portion of the polynucleotide of
claim 1; and

c) detecting said product and thereby the polynucleotide of claim 1 in the
sample.

15. The method of claim 14, wherein the polynucleotide is an RNA molecule and the method 
further comprises reverse transcribing an annealed RNA molecule into a cDNA polynucleotide.

16. A method for detecting the polypeptide of claim 10 in a sample, comprising: 

a) contacting the sample with a compound that binds to and forms a complex 
with the polypeptide under conditions and for a period sufficient to form the complex; and

b) detecting formation of the complex, so that if a complex formation is 
detected, the polypeptide of claim 10 is detected. 

17. A method for identifying a compound that binds to the polypeptide of claim 10, 
comprising: 

a) contacting the compound with the polypeptide of claim 10 under 
conditions sufficient to form a polypeptide/compound complex; and

b) detecting the complex, so that if the polypeptide/compound complex is
detected, a compound that binds to the polypeptide of claim 10 is identified.

18. A method for identifying a compound that binds to the polypeptide of claim 10,
comprising: 

a) contacting the compound with the polypeptide of claim 10, in a cell, under 
conditions sufficient to form a polypeptide/compound complex, wherein the complex drives 
expression of a reporter gene sequence in the cell; and 

b) detecting the complex by detecting reporter gene sequence expression, so 
that if the polypeptide/compound complex is detected, a compound that binds to the polypeptide 
of claim 10 is identified. 

19. A method of producing the polypeptide of claim 10, comprising:

a) culturing a host cell comprising a polynucleotide sequence selected from
the group consisting of a polynucleotide sequence of SEQ ID NO:1, a mature protein coding
portion of SEQ ID NO:2, a mature protein coding portion of SEQ ID NO:3, an active domain of
SEQ ID NO: 1-3, complementary sequences thereof and a polynucleotide sequence hybridizing
under stringent conditions to SEQ ID NO:1, under conditions sufficient to express the
polypeptide in said cell; and 

b) isolating the polypeptide from the cell culture or cells of step (a). 

20. An isolated polypeptide comprising an amino acid sequence selected from the group 
consisting of any one of the polypeptides from the Sequence Listing, the mature protein portion 
thereof, or the active domain thereof. 

21. The polypeptide of claim 20 wherein the polypeptide is provided on a polypeptide array.

22. A collection of polynucleotides, wherein the collection comprises the sequence
information of at least one of SEQ ID NO: 1.

23. The collection of claim 22, wherein the collection is provided on a nucleic acid array. 

24. The collection of claim 23, wherein the array detects full-matches to any one of the
polynucleotides in the collection.

25. The collection of claim 23, wherein the array detects mismatches to any one of the 
polynucleotides in the collection.

26. The collection of claim 22, wherein the collection is provided in a computer-readable
format.

27. A method of treatment comprising administering to a mammalian subject in need thereof 
a therapeutic amount of a composition comprising a polypeptide of claim 10 and a 
pharmaceutically acceptable carrier. 

28. A method of treatment comprising administering to a mammalian subject in need thereof 
a therapeutic amount of a composition comprising a polypeptide of claim 20 and a 
pharmaceutically acceptable carrier. 

29. A method of treatment comprising administering to a mammalian subject in need thereof 
a therapeutic amount of a composition comprising an antibody that specifically binds to a 
polypeptide of claim 10 and a pharmaceutically acceptable carrier. 



30. A method of treatment comprising administering to a mammalian subject in need thereof 
a therapeutic amount of a composition comprising an antibody that specifically binds to a 
polypeptide of claim 20 and a pharmaceutically acceptable carrier. 